Security Engineers: Govern AI Coding Access. Don't Just Block It.

The CISO said no to Cursor. The engineering lead escalated. The CISO said: fine, but security needs to own the implementation. Now it is the security engineer's problem: design an AI coding tool policy that does not block developers from working, does not create ungoverned workarounds, passes the next SOC 2 audit, and can be explained to the board if something goes wrong. The CISO wants a framework by end of quarter. The engineering lead wants approval by end of the month. Neither of them is wrong.

This post is for security engineers who have been handed that problem. It covers what AI coding governance actually requires, the three failure modes it is designed to prevent, why blocking without an alternative makes things worse, and what a well-architected governed setup looks like in practice — from the security controls through the vendor evaluation to the written policy.

What AI coding governance actually means

Governance is not a list of banned tools. It is a set of answers to specific, auditable questions. What code can reach what AI systems? Under what access controls is that access mediated? What audit trail exists for queries made? Under what retention and data handling policy is any processed code stored? What is the remediation path if a policy violation is detected?

The governance framework exists to answer those questions before an incident, not after one. A well-governed AI coding setup means that if a security incident occurs — a breach, a regulatory inquiry, a customer security audit — the security team can produce a clear account of which code reached which systems, under what controls, with what audit visibility. That account does not need to be clean. It needs to be accurate and documented.

Without governance, the account does not exist. Developers using personal AI tool accounts on personal devices generate no organizational audit trail at all. There is no documented scope of what was exposed, no retention policy governing what the vendor holds, and no access control mechanism that maps to organizational permissions. From an audit perspective, that is not a neutral position. It is a control gap that auditors are likely to flag.

The three failure modes governance is designed to prevent

Every AI coding governance framework is trying to prevent some subset of three failure modes. Understanding which ones apply to your organization determines what controls are required and what can be deprioritized.

Data exfiltration

Source code from sensitive services reaching external LLM APIs without audit trail is the primary concern for most organizations. This is not hypothetical: a developer working on payment processing logic, authentication infrastructure, or proprietary business logic using an unmanaged AI tool sends that code to a third-party LLM API with whatever retention and data handling policy that provider happens to operate under. If the developer is using a free tier or a personal account, the organization has no vendor agreement governing that exposure and no audit trail documenting it.

The control is not preventing developers from using AI tools. It is ensuring that any AI tool used for sensitive code operates under a vendor agreement that specifies data handling, retention, and breach notification, and that usage is mediated by an architecture that keeps raw sensitive code within the organizational trust boundary.

Supply chain risk

AI-generated code introduces vulnerabilities that do not get caught at the dependency review level. This is distinct from the exfiltration risk. The concern here is not what leaves the organization, but what enters the codebase. Research conducted across major AI coding tools in late 2025 documented patterns of insecure code generation that passed automated security scanners: hardcoded secrets in generated test fixtures, SQL injection vectors in generated data access layers, insecure deserialization in generated API handlers. None of these were obvious to developers reviewing the generated code in a PR diff.

Governance cannot fully prevent this failure mode — AI-generated code will introduce some vulnerabilities regardless of the governance framework. What governance can do is ensure that security scanning is applied systematically to AI-generated code and that the team has visibility into which code paths were AI-generated, enabling targeted review.
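One way to get that visibility can be sketched as follows. The sketch assumes a team convention, not a git or vendor standard, in which commits containing AI-generated code carry an `AI-Assisted: yes` trailer in the commit message. The `Commit` type and both function names are hypothetical, and how commits get loaded from the repository is left out.

```python
import re
from dataclasses import dataclass

# Hypothetical team convention: a commit message trailer marks AI-generated code.
AI_TRAILER = re.compile(
    r"^AI-Assisted:\s*(yes|true)\s*$", re.IGNORECASE | re.MULTILINE
)


@dataclass(frozen=True)
class Commit:
    sha: str
    message: str
    files: tuple[str, ...]


def is_ai_assisted(commit: Commit) -> bool:
    """Return True when the commit message carries the AI-Assisted trailer."""
    return AI_TRAILER.search(commit.message) is not None


def files_needing_targeted_review(commits: list[Commit]) -> list[str]:
    """Collect the sorted, de-duplicated files touched by AI-assisted commits."""
    flagged: set[str] = set()
    for commit in commits:
        if is_ai_assisted(commit):
            flagged.update(commit.files)
    return sorted(flagged)
```

The resulting file list could be handed to whatever scanner or reviewer rotation the team already runs. The convention only works if the trailer is applied consistently, which is a process question rather than a code question.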

Scope creep through ungoverned personal usage

The third failure mode is the one most organizations underestimate: ungoverned personal usage that bypasses org-level controls entirely. A developer who uses their personal Claude subscription to work on company code at home is not violating any written policy — because most organizations have not written a policy yet. But that usage represents a genuine exposure: company code processed by an AI tool under the developer's personal terms of service, with no organizational visibility, no access controls, and no audit trail.

Scope creep through ungoverned usage is the most common AI coding governance failure and the hardest to detect. It does not look like a breach. It looks like productivity.

Why blocking without an alternative fails

Security engineers who block AI tools without providing a governed path get a worse outcome than the one they were trying to prevent. The surface reasoning looks sound: if no AI tool is approved, there is no AI tool usage to govern. The error is assuming that withholding approval stops AI tool usage.

The ungoverned workaround problem — what happens when there is no approved alternative

What happens when security blocks AI tools without a governed alternative:

  Intended outcome:
  -> Developers stop using AI coding tools
  -> Codebase stays within organizational visibility
  -> Security team maintains audit posture

  Actual outcome (what happens):
  -> Developers use personal ChatGPT accounts at free tier
  -> Code gets pasted directly into LLM prompts, no audit trail
  -> Personal GitHub Copilot subscriptions activated on personal cards
  -> Unmanaged Claude.ai sessions with zero organizational visibility
  -> "Shadow AI" usage with no vendor agreements, no retention policy, no access controls

  Net security position:
  -> Before block: 1 vendor under review, limited audit capability
  -> After block:  5-8 unmanaged tools, zero audit capability, zero visibility

  Blocking without an alternative does not reduce AI usage.
  It moves AI usage to somewhere ungoverned.

The empirical record on this is consistent. When engineering-led AI tool requests are blocked without a governed alternative, developers find workarounds within weeks. Those workarounds — personal accounts, free tiers, browser-based tools — create a worse security posture than a well-governed approved tool would have. The security perimeter does not hold; it moves to somewhere entirely unmonitored.

The right frame is: the security team's job is not to prevent AI usage. It is to govern it. Blocking without an alternative means ceding that governance to individual developers making their own decisions about which tools to use and how to use them. That is not a security win. That is a governance abdication dressed up as caution.

The governance architecture

A well-governed AI coding setup is defined by its data flow. The question is not which tool gets approved. The question is where code goes when a developer asks a question, and whether the organization has visibility and control over that path.

The worst data flow is: developer opens sensitive file in editor, pastes excerpt into LLM prompt, LLM API processes it, response returned to developer. The organization has no record of what was pasted, no control over the LLM API's retention policy, and no visibility into the access surface this creates.

A governed data flow separates context retrieval from code exposure. The repository connects to a managed context platform. That platform builds a semantic index — a structured representation of system behavior, not raw code dumps. Developers query the index through an MCP endpoint that returns context without requiring them to send raw code to an LLM prompt. The developer gets accurate system context. The raw code never leaves the organizational trust boundary through an unaudited channel.

Security governance checklist — what each control addresses

AI coding tool governance checklist:

  DATA RESIDENCY
  [x] Where does code go when a query is made?
  [x] In what form — raw source, tokenized, semantic index?
  [x] How long is it retained, by whom, under what policy?
  [x] Is data processed in your required geographic region?

  ACCESS CONTROLS
  [x] Which repositories can which users access through the tool?
  [x] Does access map to existing repository permissions?
  [x] Can access be revoked centrally when an employee leaves?
  [x] Are there role-based controls for sensitive repositories?

  AUDIT TRAIL
  [x] Is there a log of which index was queried, by whom, when?
  [x] Can you produce that log for an auditor on demand?
  [x] Is the log tamper-evident?

  VENDOR SECURITY POSTURE
  [x] SOC 2 Type II or equivalent certification?
  [x] Subprocessor list available and contractually bounded?
  [x] Breach notification SLA documented in agreement?
  [x] Penetration testing results available under NDA?

  INCIDENT RESPONSE
  [x] If code is exfiltrated, what is the detection mechanism?
  [x] What is the notification timeline?
  [x] What is the remediation process?

The access control architecture for a governed setup maps directly to existing repository permissions. If a developer does not have access to a repository on GitHub or GitLab, they do not get context from that repository through the managed platform. The trust surface is the same OAuth authorization the organization already operates for CI/CD. There is no second access control layer to maintain, no additional user provisioning process, and no desynchronization risk between repository permissions and AI tool permissions.
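The permission mapping described above can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's implementation: `RepoHostPermissions` is a hypothetical stand-in for a lookup against the repository host, and `semantic_index` is a hypothetical mapping from repository name to already-built context.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when a user requests context for a repository they cannot read."""


@dataclass
class RepoHostPermissions:
    """Hypothetical stand-in for the repository host's permission data.

    A real deployment would ask the repository host through the existing
    OAuth authorization instead of holding a local copy.
    """

    readers: dict[str, set[str]] = field(default_factory=dict)

    def can_read(self, user: str, repo: str) -> bool:
        return user in self.readers.get(repo, set())


def get_context(
    user: str,
    repo: str,
    permissions: RepoHostPermissions,
    semantic_index: dict[str, str],
) -> str:
    """Return indexed context only if the repo host says the user can read the repo."""
    if not permissions.can_read(user, repo):
        raise AccessDenied(f"{user} has no read access to {repo}")
    # Assumes the repository has been connected and indexed.
    return semantic_index[repo]
```

Because the check runs against the host's permission data on every query, removing a user from the repository is enough to cut off context access. There is no second list to update.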

Audit trail requirements are satisfied by logging at the MCP endpoint level: which repository index was queried, by which user, at what time. This is not the same as logging every developer prompt — it is logging the context retrieval events that represent the organizational risk. That log is producible on demand for an auditor without requiring access to individual developer sessions.
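A tamper-evident log of those retrieval events can be approximated with a hash chain, where each entry commits to the one before it. The sketch below is a minimal illustration using Python's standard library. The field names and the `append_entry` and `verify_chain` helpers are assumptions made for the example, not a description of any product's log format.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


@dataclass(frozen=True)
class AuditEntry:
    user: str
    repo_index: str
    timestamp: str  # ISO 8601 string supplied by the caller
    prev_hash: str
    entry_hash: str


def _digest(user: str, repo_index: str, timestamp: str, prev_hash: str) -> str:
    """Hash the entry fields together with the previous entry's hash."""
    payload = json.dumps([user, repo_index, timestamp, prev_hash], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list[AuditEntry], user: str, repo_index: str, timestamp: str) -> AuditEntry:
    """Append a retrieval event that is chained to the current tail of the log."""
    prev_hash = log[-1].entry_hash if log else GENESIS
    entry = AuditEntry(
        user, repo_index, timestamp, prev_hash,
        _digest(user, repo_index, timestamp, prev_hash),
    )
    log.append(entry)
    return entry


def verify_chain(log: list[AuditEntry]) -> bool:
    """Return False if any entry was altered, removed mid-chain, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry.prev_hash != prev_hash:
            return False
        if entry.entry_hash != _digest(entry.user, entry.repo_index, entry.timestamp, entry.prev_hash):
            return False
        prev_hash = entry.entry_hash
    return True
```

A chain like this detects edits only if the most recent hash is also recorded somewhere the log writer cannot modify. Without that anchor, someone with write access could rebuild the whole chain.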

The SOC 2 and ISO 27001 framing

Auditors asking about AI tool usage in a SOC 2 or ISO 27001 context are focused on a defined set of questions. The security engineer's job is to have answers to those questions documented before the audit, not to explain gaps during it.

Data classification and handling. Auditors want to know how the organization classifies the code that reaches AI tools, what handling requirements apply to each classification tier, and how those requirements are enforced in practice. A governed setup answers this through the repository scope configuration: certain repositories are excluded from AI context indexing based on their classification. That exclusion is documented, auditable, and technically enforced rather than relying on developer judgment.

Retention policies. The auditor wants to know how long code data is retained by AI tool vendors and under what deletion procedures. This requires a vendor agreement that specifies retention terms — not a privacy policy, but a contractual commitment with a defined term and a deletion procedure that can be invoked. Personal accounts and free tiers do not provide this. Org-level vendor agreements do.

Access controls. Who can access what, how access is provisioned, how it is deprovisioned, and whether there is a record of current access state. A governed setup where AI tool access is mediated by existing repository OAuth answers all four questions with the answer the organization has already documented for "how do you control repository access?"

Incident response. What is the procedure if an AI tool vendor experiences a breach? The auditor wants a documented procedure: detection, notification timeline, containment, and remediation. This requires a vendor agreement with defined breach notification SLAs and a documented internal procedure that maps to those SLAs.
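The classification-based exclusion described under data classification above lends itself to a small sketch. The tier names, the `MAX_INDEXABLE` threshold, and the `split_scope` function are all hypothetical. An organization would substitute its own classification scheme.

```python
from enum import Enum


class Classification(Enum):
    # Hypothetical tiers, ordered from least to most sensitive.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Assumed policy choice: nothing above INTERNAL is indexed for AI context.
MAX_INDEXABLE = Classification.INTERNAL


def split_scope(
    all_repos: list[str],
    classifications: dict[str, Classification],
    max_indexable: Classification = MAX_INDEXABLE,
) -> tuple[list[str], list[str]]:
    """Split repositories into (in_scope, excluded) for AI context indexing.

    Repositories with no recorded classification are excluded by default.
    """
    in_scope: list[str] = []
    excluded: list[str] = []
    for repo in sorted(all_repos):
        tier = classifications.get(repo)
        if tier is not None and tier.value <= max_indexable.value:
            in_scope.append(repo)
        else:
            excluded.append(repo)
    return in_scope, excluded
```

Excluding unclassified repositories by default is a deliberate choice in this sketch. A new service stays out of scope until someone classifies it, so a missed classification cannot silently expose code.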

The vendor evaluation process

Security engineers evaluating AI coding tool vendors are running a structured process regardless of whether they have written it down. The dimensions that matter are consistent across organizations.

Data residency. Where is code processed? Where is it stored? Is the processing location compliant with your organization's data residency requirements? This matters most for companies subject to GDPR, CCPA, or sector-specific frameworks such as HIPAA or PCI DSS.

Raw code transmission. The most important technical question in the vendor evaluation is whether the tool sends raw source code to a third-party LLM API, or whether processing occurs on infrastructure the vendor controls or within the organizational trust boundary. Tools that send raw code to OpenAI, Anthropic, or other LLM APIs as part of their default operation create a different risk profile than tools that process code on vendor-controlled infrastructure or serve context without transmitting raw code.

Subprocessors. Which third parties does the vendor use to process data? The subprocessor list is a compliance requirement under GDPR and a practical requirement for any organization doing vendor security reviews. A vendor who cannot produce a current subprocessor list and a contractual commitment to notify of subprocessor changes is not ready for enterprise procurement.

Security certifications. SOC 2 Type II is the baseline. ISO 27001 is relevant for organizations subject to European compliance requirements. Penetration testing results available under NDA provide additional assurance. A vendor without at least SOC 2 Type II should not be in the final evaluation set for any organization with meaningful compliance obligations.

Breach notification SLA. What is the contractual commitment for notifying the customer of a security incident? GDPR requires notification to the supervisory authority within 72 hours of becoming aware of a personal data breach. Vendor agreements should include a notification SLA that allows the customer to meet their downstream notification obligations.
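The arithmetic behind an adequate notification SLA can be sketched as follows. The sketch assumes, purely for illustration, a downstream deadline measured from the moment the vendor detects the incident, as a customer contract clause might specify. When a clock starts under a particular regulation is a legal question this sketch does not settle. The 72-hour default is an example value, and both function names are hypothetical.

```python
from datetime import timedelta

# Example value only; substitute the deadline from the actual obligation.
EXAMPLE_DOWNSTREAM_DEADLINE = timedelta(hours=72)


def max_vendor_sla(
    internal_response_time: timedelta,
    downstream_deadline: timedelta = EXAMPLE_DOWNSTREAM_DEADLINE,
) -> timedelta:
    """Longest vendor notification SLA that still leaves time for internal handling."""
    remaining = downstream_deadline - internal_response_time
    if remaining <= timedelta(0):
        raise ValueError("internal response time alone exceeds the downstream deadline")
    return remaining


def sla_is_sufficient(
    vendor_sla: timedelta,
    internal_response_time: timedelta,
    downstream_deadline: timedelta = EXAMPLE_DOWNSTREAM_DEADLINE,
) -> bool:
    """Check that vendor notice plus internal handling fits inside the deadline."""
    return vendor_sla + internal_response_time <= downstream_deadline
```

Under these assumptions, a team that needs 24 hours to triage and draft a notice would want a vendor SLA of 48 hours or less. A 60-hour SLA would leave too little time.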

Kognita's security model

For security engineers evaluating Kognita against the dimensions above: the architecture is designed specifically to address the data flow problem that makes other AI coding tools difficult to govern.

Repository access is via OAuth to the organization's existing repo host — GitHub, GitLab, or Bitbucket. The same OAuth authorization that governs CI/CD access governs Kognita access. There is no separate credential set to manage, no second permission layer to maintain, and no desynchronization risk.

The semantic index is built on Kognita's infrastructure, not on individual developer laptops. When a developer queries their AI session for information about the codebase, the query reaches Kognita's MCP endpoint and returns semantic context. Raw source code is not transmitted to an LLM prompt through Kognita's query path. Developers get better context than they would from pasting raw code, while the organization avoids the raw code transmission risk that personal AI tool usage creates.

Access is scoped to the repositories the organization explicitly connects. A repository that has not been connected is not indexed and cannot be queried. Repository exclusions based on classification can be enforced at the connection level — the security team configures which repos are in scope, and that configuration is technically enforced rather than relying on developer compliance.

Writing the AI coding tool policy

A written policy is required for SOC 2 and ISO 27001 compliance and is the foundation for any enforcement action if a violation occurs. The policy does not need to be long. It needs to be specific enough to answer the auditor's questions and clear enough that a developer reading it knows what they are and are not allowed to do.

AI coding tool policy template sections

AI coding tool policy — required sections:

  1. APPROVED TOOLS LIST
     - Tool name, version/tier, approved use cases
     - Repository scope: which repos are in scope for AI context
     - Prohibited uses: what code categories cannot be used with each tool

  2. DATA CLASSIFICATION REQUIREMENTS
     - Which data classifications require elevated approval before AI tool use
     - How developers identify classified data in their working context
     - Escalation path when classification is unclear

  3. ACCESS AND AUTHENTICATION
     - How access is provisioned (SSO, OAuth, central account)
     - How access is deprovisioned when an employee leaves
     - Who is responsible for maintaining the approved tools list

  4. EXCEPTION PROCESS
     - How developers request approval for tools not on the approved list
     - Review timeline and required documentation
     - Temporary vs. permanent exception criteria

  5. INCIDENT RESPONSE
     - What constitutes an AI tool security incident
     - Reporting path and timeline
     - Containment procedure for suspected code exfiltration

  6. TRAINING REQUIREMENT
     - What training is required before using approved AI tools
     - How completion is tracked
     - Renewal cadence

The exception process is one of the most important sections to get right. Developers will encounter tools not on the approved list and will need a path to get them reviewed. An exception process that takes four weeks and requires a security review from scratch effectively functions as a block — developers skip it and use the tool without authorization. An exception process with a defined timeline, a questionnaire that captures the required security information, and a documented approval path functions as governance.
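A defined timeline is easier to hold to when it is tracked. The sketch below shows one way to flag exception requests that have outlived their review window. The ten-day `REVIEW_SLA` is an arbitrary example value, and the `ExceptionRequest` fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Example value only; the policy itself should state the real review timeline.
REVIEW_SLA = timedelta(days=10)


@dataclass(frozen=True)
class ExceptionRequest:
    tool: str
    requested_by: str
    submitted: date
    decided: Optional[date] = None  # None while the review is still open


def review_due(request: ExceptionRequest, sla: timedelta = REVIEW_SLA) -> date:
    """Date by which the security review should reach a decision."""
    return request.submitted + sla


def overdue_requests(
    requests: list[ExceptionRequest],
    today: date,
    sla: timedelta = REVIEW_SLA,
) -> list[ExceptionRequest]:
    """Return open requests whose review deadline has already passed."""
    return [
        r for r in requests
        if r.decided is None and today > review_due(r, sla)
    ]
```

A report like this gives the security team an early signal that the exception process is drifting toward the de facto block described above.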

The training requirement deserves more investment than most organizations give it. Developers need to understand not just which tools are approved, but what the security rationale is — specifically, why raw code pasting into personal LLM sessions is a governance risk even if the code does not look sensitive. That understanding is what makes the ungoverned workaround problem less likely. Developers who understand the security architecture are more likely to use governed tools and less likely to route around them.

Final take

The security engineer's job is not to say no to AI tools. It is to architect a yes that survives an audit. Blocking without a governed alternative produces ungoverned usage that is worse for security than the approved tool would have been. The goal is a data flow architecture where code stays within the organizational trust boundary, access is mediated by controls that map to existing permissions, and the audit trail is producible on demand.

That architecture exists. It requires choosing a context platform that separates semantic index access from raw code transmission, mapping access to existing repository permissions, and maintaining a written policy with an exception process that functions as governance rather than obstruction. The outcome is not a security team that blocked AI tools. It is a security team that was the reason the organization adopted AI tools safely — and can demonstrate that to the next auditor who asks.