How Social Workers Can Evaluate AI Documentation Tools: A Privacy and Compliance Checklist

NASW has not published AI-specific documentation guidance; its last technology standards update was in 2017. This guide fills that vacuum: what social workers should look for in AI documentation tools, covering data handling, BAA availability, court subpoena exposure, 42 CFR Part 2, and Medicaid audit risk, with a printable evaluation checklist.

As of early 2026, the National Association of Social Workers has not published specific guidance on AI documentation tools. The last substantive technology standards update from NASW was in 2017, before large language models entered the clinical workspace. The British Association of Social Workers published initial AI guidance in March 2025. No U.S. equivalent exists.

That vacuum has consequences. Social workers are adopting AI documentation tools individually, inconsistently, and often without a framework for evaluating the compliance and privacy risks specific to their profession. A therapist evaluating an AI tool can reference reasonably clear HIPAA guidance and a growing body of peer conversation about what "signed BAA" means in practice. A social worker has the same HIPAA obligations, plus Medicaid audit exposure, court subpoena risk, and, in SUD settings, the additional layer of 42 CFR Part 2 protections. No professional body has told them what questions to ask.

This guide exists to fill that gap. It is not a product review. It is a structured evaluation framework you can use before adopting any AI documentation tool, or for reviewing a tool you are already using.

Why Social Work Has Distinct Documentation Stakes

The documentation stakes in social work are different from those in most other clinical disciplines, in ways that matter specifically when an AI system is involved in generating or formatting your notes.

Therapy notes become legal documents in unusual circumstances. Social work records become legal documents regularly. Dependency court hearings, custody evaluations, child welfare investigations, parole and probation compliance reviews, housing court proceedings: all of these can result in subpoenas for case notes. In child welfare settings, a social worker may have 30 or more court dates per quarter.

When an AI system contributed to the language in a note, and that note is now evidence in a proceeding, the question of who is responsible for its accuracy becomes real. Social workers need to be able to testify that the note reflects their direct observation and professional judgment. If an AI tool introduced language they did not write and did not review carefully, that creates exposure.

Medicaid audit risk is structural, not occasional

Social workers who bill Medicaid, or who work in agencies where Medicaid is the dominant payer, face post-payment audits conducted by state contractors or Recovery Audit Contractors. These reviewers are not checking for clinical quality. They are checking whether the documentation supports the billed code. If a progress note was substantially generated by an AI and contains language that does not match the actual service delivered, that is a recoupment trigger, not just a documentation problem.

The notes you write today may be audited 18 months from now. The AI tool you used to write them may no longer exist as a company by then, and its data retention policies may have expired. You need to know whether your documentation stands on its own.

42 CFR Part 2 creates heightened confidentiality obligations for SUD cases

Social workers who document substance use disorder (SUD) treatment are subject to 42 CFR Part 2, a federal regulation that imposes stricter confidentiality rules than standard HIPAA. Under 42 CFR Part 2, SUD records may not be disclosed without specific patient consent, even to other healthcare providers, with limited exceptions.

When an AI tool processes session notes from SUD treatment, the question is not just whether the tool is HIPAA-compliant. It is whether the tool's data handling satisfies 42 CFR Part 2 requirements. Most AI documentation tools do not address this distinction. Most of their marketing language does not either.

Professional accountability does not transfer

Social workers bear professional accountability for their documentation in a way that is not diminished by disclosing that a tool assisted in creating it. If a note contains a factual error that influences a court decision, a licensing board inquiry, or a Medicaid audit outcome, the social worker's license is the thing at risk. The vendor has no license to lose.

This is not an argument against using AI tools. It is an argument for understanding exactly what the tool does with your information and what it actually outputs before you adopt it into your workflow.

What to Look for in an AI Documentation Tool

Data handling: where do your notes go?

The foundational question for any AI documentation tool is what happens to the content you provide. There are meaningful differences between tools, and the marketing language does not always reflect the technical reality.

Questions to ask:

  • Does the tool send your typed session summaries to an external AI provider (such as OpenAI, Anthropic, or Google)? If so, under what data processing agreement?
  • Does the tool store your notes after generation? For how long? Can you request deletion?
  • Is your data used to train the vendor's AI models? Many terms of service include an opt-out, but the default is often that your data is used for training unless you actively decline.
  • Where are servers located? This matters for state-specific privacy regulations and for international social work contexts.

A tool may describe itself as "HIPAA-compliant" without specifying whether that compliance extends to the third-party AI providers it relies on. Ask whether the vendor has a data processing agreement or business associate agreement not just with you, but with every subprocessor that handles your data.

What a clear answer looks like: "We send your typed notes to [AI provider] for processing. That provider operates under a data processing agreement that prohibits them from using your data for model training. Notes are deleted from our servers within [X] hours of generation. We do not retain session content."

What an unclear answer looks like: "We take privacy seriously and follow all applicable regulations." Full stop.

Business Associate Agreements: what they are and why they matter

A Business Associate Agreement (BAA) is a legal contract required under HIPAA before a vendor can handle Protected Health Information (PHI) on your behalf. Without a signed BAA, using an AI tool to process clinical notes may constitute a HIPAA violation, even if the tool claims to be compliant.

Key points about BAAs in the context of AI documentation tools:

  • Some tools offer a BAA upon request or as part of a paid tier. Others include it automatically. Some do not offer one at all.
  • A BAA does not guarantee security. It allocates legal responsibility and establishes minimum standards. A signed BAA combined with weak data practices is still a risk.
  • If you work in an agency, your agency's IT and compliance team must review any BAA before you use a tool with client information. You likely cannot sign a BAA independently for an agency-managed system.
  • For private practice LCSWs, a BAA is typically your own responsibility to obtain and retain.

Before adopting any tool, ask directly: "Do you offer a signed BAA for HIPAA compliance?" If the answer is no, evaluate carefully whether using the tool with identifiable client information is defensible.

Note: Some AI documentation tools, including some lower-cost options, are transparent about not being HIPAA-compliant. That transparency does not automatically disqualify them; it depends on how you use them. A tool used for template structuring with no actual PHI entered has a different risk profile than a tool processing identified case notes. Know the difference and make a documented decision.

Data retention policies: how long is your information held?

Data retention is underexamined in vendor evaluations. Social workers should ask:

  • How long does the vendor retain notes, session summaries, or any inputs you provide?
  • What is the deletion policy when you cancel your account?
  • What happens to your data if the company is acquired or shuts down?
  • Are deletion requests honored, and how quickly?

Vague language such as "data is retained as long as necessary" is a risk signal. "As long as necessary" has no fixed endpoint and can mean indefinitely in practice. For social workers with court subpoena exposure, unclear retention language means you may have no idea what documentation exists about your clients' cases in a vendor's servers months or years later.

Compare this to tools that commit to a specific retention window ("notes are deleted within 24 hours of generation") or to zero retention ("we do not store your notes after delivery"). The latter architecture eliminates an entire category of exposure.
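One way to see why a fixed window matters: with a stated retention period you can reason about whether content could still exist on a vendor's servers when an audit arrives; with "as long as necessary," you cannot. A toy sketch of that reasoning, assuming the vendor honors its stated window (the function and parameter names are illustrative, not any vendor's API):

```python
from datetime import datetime, timedelta

def content_exists_at(created: datetime, retention_hours: float | None,
                      when: datetime) -> bool | None:
    """Could note content still sit on the vendor's servers at `when`?

    retention_hours=None models "as long as necessary": no answer is
    possible, which is exactly the audit problem with vague language.
    """
    if retention_hours is None:
        return None  # indefinite retention: you cannot know
    return when < created + timedelta(hours=retention_hours)

created = datetime(2026, 1, 5)
audit = created + timedelta(days=540)            # audited ~18 months later
print(content_exists_at(created, 24, audit))     # False: window long expired
print(content_exists_at(created, None, audit))   # None: unknowable
```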

Audit trails: can you reconstruct what happened?

For Medicaid billing and court proceedings, you may need to demonstrate that a note was created on a specific date, by a specific provider, and was not retroactively edited. An AI documentation tool should provide:

  • Timestamps on note creation and any subsequent edits
  • Version history or at minimum a record of when notes were generated
  • Audit log export capability for use in compliance reviews or legal proceedings

If a tool does not maintain or export an audit trail, you are dependent on your own EHR or paper records to reconstruct the documentation history. That gap creates risk in audit and legal contexts.
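To make the requirement concrete, here is a minimal sketch of what a single exportable audit-log entry could look like. The field names are hypothetical rather than any vendor's actual schema; the essentials are who, what, when, and which version.

```python
import json
from datetime import datetime, timezone

# Hypothetical audit-log entry; the schema is illustrative, not a real
# vendor's export format.
entry = {
    "note_id": "note-0001",
    "author": "jdoe-lcsw",
    "event": "note_generated",
    "created_at": datetime(2026, 4, 12, 15, 30, tzinfo=timezone.utc).isoformat(),
    "last_edited_at": None,  # stays None unless the note is later edited
    "version": 1,
}

# A portable export (JSON here) is what lets you attach the log to an
# audit response or a legal filing.
print(json.dumps(entry, indent=2))
```

If a vendor cannot produce something equivalent on request, treat the audit-trail requirement as unmet.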

Output control: who controls the final note?

This is the dimension that most directly addresses the professional accountability problem.

Some AI documentation tools operate generatively: you provide a session summary or audio recording, and the tool produces a complete note. The clinician reviews and signs it. In this model, the AI is the primary author and the clinician is the reviewer. If the AI introduced a factual error or a fabricated clinical detail, that error was present before the clinician read the note. Whether the clinician caught it is an open question.

Other tools operate within a template structure: the clinician provides the inputs, the AI formats them according to the selected template, and the output reflects only what the clinician provided. In this model, the AI is a formatter, not an author. The clinician's inputs are the only source material.

The distinction matters for accountability. A note that was formatted by AI from your own inputs is more defensible than a note that was authored by AI and reviewed by you. In court, the question is not whether AI was involved. The question is whether you can testify to the accuracy of every claim in the note.

Ask any tool you evaluate: "Does the system generate content that was not in my original input, or does it only format and structure what I provide?"

Format flexibility: does it match your documentation requirements?

Social workers use a wider range of documentation formats than most clinical disciplines. A tool that works well for SOAP progress notes may not handle:

  • Case management notes with collateral contact documentation
  • Safety assessments with structured risk rating fields
  • Court reports and court social summaries with legal formatting requirements
  • Service authorization documentation tied to Medicaid billing codes
  • Multidisciplinary team meeting summaries with multiple provider inputs
  • 42 CFR Part 2-compliant SUD case notes with restricted disclosure notation

Evaluate whether a tool can adapt to these formats, or whether it is built for one format type and will require significant manual adjustment for everything else.

The Generative vs. Template-First Distinction

This deserves more attention than it typically gets in tool comparisons.

Generative AI documentation tools ingest your session content and produce a structured note. The sophistication of the output depends on the model, the prompt design, and the quality of what you provide. The risk is fabrication: language models can generate plausible-sounding clinical content that was not present in the original input. This has happened in documented cases across multiple tools.

For social workers, the depersonalization risk is specific: AI-generated notes can flatten the contextual detail that social work documentation is designed to capture. A case note that needs to reflect a client's specific housing instability, family dynamics, or cultural context may come out as generic clinical language that fits any client. That loss of specificity is not just a quality problem. In a court context or Medicaid audit, generic language raises questions about whether the note reflects an actual service encounter.

Template-first tools work differently. You define the structure. You provide the content. The AI formats and fills based on what you gave it, without generating content that was not in your input. This architecture limits fabrication risk and preserves the clinician's voice and specificity.
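As a mental model, a template-first pipeline behaves like a strict fill-in: it fails on a missing field rather than inventing content to cover the gap. A simplified sketch, with an illustrative template and field names that do not correspond to any specific product:

```python
# Illustrative template; a real tool would use your agency's formats.
TEMPLATE = (
    "Data: {data}\n"
    "Assessment: {assessment}\n"
    "Plan: {plan}"
)

def format_note(template: str, clinician_inputs: dict) -> str:
    """Fill the template strictly from clinician inputs.

    str.format raises KeyError on a missing field instead of
    generating plausible text to fill it. That strictness is the point.
    """
    return template.format(**clinician_inputs)

print(format_note(TEMPLATE, {
    "data": "Client reported two missed shifts due to a childcare gap.",
    "assessment": "Housing stable; employment risk increasing.",
    "plan": "Referral to county childcare subsidy program by Friday.",
}))
```

A generative tool inverts this: given the same missing field, it can produce fluent language to fill it, which is exactly where fabrication risk lives.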

Neither model eliminates the need for clinician review. But they have different error profiles and different accountability implications.

Some tools in the market, including NotuDocs, use the template-first approach: you provide the session notes, the tool formats them within your chosen template structure. This does not make the tool automatically appropriate for your practice setting, but it does mean the output reflects your inputs rather than the model's inferences.

Evaluating Claims of HIPAA Compliance

Conversations in therapist and social work communities surface a recurring problem: vendors claiming HIPAA compliance without the substance to back it up. Experienced practitioners describe this as "HIPAA theater."

When a vendor says "we are HIPAA-compliant," that claim has several possible interpretations:

  1. We follow security practices consistent with HIPAA requirements (a commitment to practices, not a certification)
  2. We have conducted a formal HIPAA risk assessment and can document our compliance program
  3. We will sign a BAA (a specific legal commitment)
  4. We are certified under a HIPAA-aligned framework such as SOC 2 Type II

These are not equivalent, and no government body certifies HIPAA compliance, so the bare claim tells you little. Ask specifically: "Can you provide your BAA for review? Have you completed a HIPAA risk assessment? Do you have a formal compliance program?"

A vendor that becomes defensive or vague when asked these questions has told you something useful.

Common Evaluation Mistakes

Assuming "encrypted" means "compliant"

Encryption in transit and at rest is a minimum baseline, not a compliance achievement. Most modern cloud tools are encrypted. Encryption does not address how long data is retained, what subprocessors have access, or whether your data is used for model training.

Evaluating based on marketing language alone

Marketing pages present tools in the best possible light. The privacy policy, terms of service, and data processing addendum contain the actual commitments. Read those documents, or ask a knowledgeable colleague to review them.

Not involving your agency's IT and compliance team

Private practice LCSWs make individual purchasing decisions. Agency-employed social workers generally cannot. If you work in an agency, using an AI documentation tool for client work without IT and compliance review may violate your agency's policies, jeopardize your liability coverage, and potentially violate HIPAA. Raise the question before you adopt a tool, not after.

Using free tools with PHI

Free tiers of AI tools often have different (and weaker) data handling commitments than paid tiers. Vendors may monetize free tier data through model training. A tool that costs nothing often costs something in data.

Relying on peer recommendations without independent verification

If a colleague says "I've been using this for months and nothing bad has happened," that is a survivorship signal, not a safety signal. Evaluate tools on their documented policies, not on the absence of known incidents.

Evaluation Checklist for Social Workers

Use this checklist before adopting any AI documentation tool for clinical social work.

Privacy and Data Handling

  • The vendor clearly explains what happens to inputs (typed notes, session summaries) after processing
  • The vendor specifies which third-party AI providers process your data and under what agreements
  • The vendor does not use your data for model training (or offers a clear opt-out with confirmation)
  • Data retention after generation is specified (24-hour deletion or similar) rather than vague
  • Server location is disclosed and consistent with applicable regulations
  • Account deletion results in data deletion within a specified timeframe

BAA and HIPAA

  • The vendor offers a signed BAA for HIPAA compliance
  • The BAA covers subprocessors (not just the vendor itself)
  • The vendor can describe their HIPAA compliance program, not just claim compliance
  • If BAA is not available, you have documented a decision that no PHI will be entered into the tool

42 CFR Part 2 (SUD Settings)

  • If you document SUD treatment, you have confirmed whether the vendor addresses 42 CFR Part 2 requirements
  • You have verified whether the vendor treats SUD documentation the same as or differently from other clinical notes
  • Your agency's compliance officer or legal counsel has reviewed the tool if you document SUD cases

Audit Trail and Documentation Integrity

  • Notes are timestamped at creation and edit
  • Version history or edit logs are available
  • Audit logs can be exported for compliance review or legal proceedings
  • The tool produces documentation you can stand behind independently of the tool's existence (i.e., if the vendor shuts down, your records are complete)

Output Control and Fabrication Risk

  • You understand whether the tool generates content not present in your inputs (generative) or only formats your inputs (template-first)
  • If generative, you have a review protocol to verify every clinical claim before signing
  • The tool does not introduce clinical language, diagnoses, or facts that were not in your original input
  • You can testify to the accuracy of every statement in a note generated with AI assistance

Format Flexibility

  • The tool supports the note formats your work requires (SOAP, DAP, case management notes, safety assessments, court summaries, MDT meeting notes)
  • Service authorization documentation fields are available or can be customized
  • You can add or modify templates to match your agency's specific format requirements

Organizational and Professional

  • You have documented which tool you are using and under what privacy terms
  • If agency-employed, your IT and compliance team has approved the tool
  • Your liability insurer has been consulted or notified (some malpractice carriers have specific guidance on AI documentation tools)
  • You have a process for reviewing AI-assisted notes before signing, not just signing what the tool produces
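The organizational items are easier to satisfy if the evaluation itself leaves a record. A minimal sketch of what that record could capture; the product name and fields are hypothetical placeholders, so adapt them to your agency's compliance format:

```python
from datetime import date

# Hypothetical evaluation record; the tool name and fields are
# illustrative placeholders, not a required schema.
evaluation = {
    "tool": "ExampleNotes AI",
    "evaluated_on": date(2026, 2, 10).isoformat(),
    "evaluated_by": "J. Doe, LCSW",
    "baa_signed": True,
    "training_opt_out_confirmed": True,
    "retention_window_hours": 24,
    "phi_permitted": True,          # set False if used for templates only
    "part2_applicable": False,      # True if you document SUD treatment
    "compliance_review": "approved 2026-02-14",
}
```

A dated record like this turns "we looked into it" into a documented decision when an auditor or insurer asks.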

The absence of NASW-specific AI guidance is a real gap, not a bureaucratic oversight. Until that guidance exists, social workers are making consequential decisions about AI adoption without the professional scaffolding that other disciplines have. This checklist is a starting point for structured decision-making. It is not a substitute for consulting with your licensing board, your liability insurer, or your agency's compliance team when the stakes are high.

The social workers most at risk from poorly evaluated AI tools are not the ones who ignore AI entirely. They are the ones who adopt a tool based on peer recommendation and a free trial, without reading the terms of service or asking whether the tool offers a BAA.

Take an extra thirty minutes before your next tool adoption. It is less time than a Medicaid audit response takes.

