{"id":1843,"date":"2025-02-10T09:00:00","date_gmt":"2025-02-10T09:00:00","guid":{"rendered":"https:\/\/cybersecurityinfocus.com\/?p=1843"},"modified":"2025-02-10T09:00:00","modified_gmt":"2025-02-10T09:00:00","slug":"nearly-10-of-employee-gen-ai-prompts-include-sensitive-data","status":"publish","type":"post","link":"https:\/\/cybersecurityinfocus.com\/?p=1843","title":{"rendered":"Nearly 10% of employee gen AI prompts include sensitive data"},"content":{"rendered":"<div>\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<div class=\"container\"><\/div>\n<p>Gen AI data leaks from employees are an enterprise nightmare in the making.<\/p>\n<p>According to a <a href=\"https:\/\/www.harmonic.security\/resources\/from-payrolls-to-patents-the-spectrum-of-data-leaked-into-genai\">recent report on gen AI data leakage<\/a> from Harmonic, 8.5% of employee prompts to popular LLMs included sensitive data, presenting security, compliance, privacy, and legal concerns.\u00a0<\/p>\n<p>Harmonic, which analyzed tens of thousands of prompts to ChatGPT, Copilot, Gemini, Claude, and Perplexity during Q4 2024, found that customer data, including billing information and authentication data, accounted for the largest share of leaked data at 46%. Here, Harmonic highlighted insurance claims as a type of report rife with customer data that is frequently entered into gen AI tools by employees to save time in processing.<\/p>\n<p>Employee data, including payroll data and personally identifiable information (PII), accounted for 27% of sensitive prompts, followed by legal and finance data at 15%.<\/p>\n<p>\u201cSecurity-related information, comprising 6.88% of sensitive prompts, is particularly concerning,\u201d according to the report. \u201cExamples include penetration test results, network configurations, and incident reports. 
Such data could provide attackers with a blueprint for exploiting vulnerabilities.\u201d<\/p>\n<h2 class=\"wp-block-heading\">Out from the shadows<\/h2>\n<p>Generative AI data leakage is a challenging problem \u2014\u00a0and a key reason why enterprise <a href=\"https:\/\/www.csoonline.com\/article\/3801012\/gen-ai-strategies-put-cisos-in-a-stressful-bind.html\">gen AI strategies are putting CISOs in a stressful bind<\/a>.<\/p>\n<p>Enterprise LLM use falls into three broad categories: sanctioned deployments, including licensed and in-house developed implementations; shadow AI, typically comprising free consumer-grade apps forbidden by the enterprise <a href=\"https:\/\/www.csoonline.com\/article\/2138447\/unauthorized-ai-is-eating-your-company-data-thanks-to-your-employees.html\">for good reason<\/a>; and semi-shadow gen AI.<\/p>\n<p><a href=\"https:\/\/www.csoonline.com\/article\/2138447\/unauthorized-ai-is-eating-your-company-data-thanks-to-your-employees.html\">Unauthorized shadow AI is a primary issue for CISOs<\/a>, but this last category is a growing problem that may be the hardest to control. Initiated by business unit chiefs, semi-shadow AI can include paid gen AI apps that have not received IT approval, enlisted for experimentation, expediency, or productivity enhancement. In such instances, the executive may be engaging in shadow IT while line-of-business employees are not, having been told to make use of the tools by management as part of its AI strategy.<\/p>\n<p>Shadow or semi-shadow, free generative AI apps are the most problematic, as their license terms usually allow for training on every query. According to Harmonic\u2019s research, free-tier AI use commands the lion\u2019s share of sensitive data leakage. 
For example, 54% of sensitive prompts were entered on ChatGPT\u2019s free tier.<\/p>\n<p>But most data specialists also discourage CISOs from trusting contractual promises of paid gen AI apps, most of which prohibit training on user queries in enterprise versions.<\/p>\n<p>Robert Taylor, an attorney with the Carstens, Allen &amp; Gourley intellectual property law firm, gives the example of trade secrets. Various legal protections \u2014 especially trade secret protections \u2014 can be lost if an employee asks a generative AI system a question that reveals the trade secret, he said, adding that lawyers protecting IP often have team members ask questions of a wide range of AI apps about trade secrets to see whether prohibited data is discovered. If so, then they know someone leaked it.<\/p>\n<p>If a competitor learns of the leak, it can argue in court that the leak invalidates the trade secret\u2019s legal protections. According to Taylor, the IP owner\u2019s lawyers must then prove the enterprise deployed a wide range of mechanisms to protect the secret. Relying on the provisions of a contract that promises no training on generative AI queries \u201cis not a sufficient level of reasonable effort,\u201d Taylor said.<\/p>\n<p>\u201cIt would be a totality-of-circumstances situation,\u201d he said. Enterprises must deploy and strictly enforce \u201cpolicies that constrain your employees on use of that data.\u201d<\/p>\n<h2 class=\"wp-block-heading\">Data-conscious practices<\/h2>\n<p>CISOs should work with business leaders to ensure employees are trained on ways to get the same results from LLMs without using protected data, said Jeff Pollard, a VP and principal analyst at Forrester. 
Doing so requires more finesse with prompts, but it protects sensitive information without diluting the effectiveness of the AI\u2019s generated answer.<\/p>\n<p>\u201cYou really don\u2019t have to reveal sensitive information in order to get a positive benefit out of the system, but we do have to train users to understand query phrasing\u201d strategies, Pollard said.\u00a0<\/p>\n<p>When it comes to employee use of free AI tools rather than locked-down corporate-paid apps, \u201ccracking down on employees is the most obvious thing to do, but the core question is: \u2018Why are employees doing it?\u2019\u201d asked Arun Chandrasekaran, a distinguished VP and analyst at Gartner.<\/p>\n<p>\u201cEmployees are doing it because IT is not providing them the tools they need,\u201d he argued.<\/p>\n<p>CISOs should point this out to their C-suite counterparts to help ensure that enterprise-wide AI tools are \u201ctruly usable,\u201d he said.<\/p>\n<p>Unfortunately, with generative AI, the genie is already out of the bottle, according to Kaz Hassan, senior community and partner marketing manager at software vendor Unily.<\/p>\n<p>\u201cAI use by employees has outrun the ability of IT teams to catch up,\u201d he said. \u201cIT teams know the situation isn\u2019t great but aren\u2019t able to crack the comms, culture, or strategy part of the equation to make an impact.\u201d<\/p>\n<p>Hassan added: \u201cA new blueprint is needed, and organizations need clear AI strategies now to reduce risk, and they need to follow up with AI woven into the employee tech stack imminently.\u201d<\/p>\n<p>Typical monitoring and control apps miss the point of the data leakage, he claimed.\u00a0<\/p>\n<p>\u201cPower users are processing sensitive data through unauthorized AI tools not because they can\u2019t be controlled, but because they won\u2019t be slowed down. 
The old playbook of restrict-and-protect isn\u2019t just failing \u2014 it\u2019s actively pushing AI innovation into the shadows,\u201d Hassan said. \u201cCISOs need to face this reality: either lead the AI transformation or watch their security perimeter dissolve.\u201d<\/p>\n<p>Hassan pointed out that the data problem from generative AI goes in two directions: sensitive data leaving via queries, and flawed data \u2014 either via hallucinations or having been trained on incorrect information \u2014 coming into the enterprise via generative AI answers that your team relies on for corporate analysis.<\/p>\n<p>\u201cToday\u2019s CISOs shouldn\u2019t just worry about sensitive data getting out,\u201d Hassan said. \u201cThey should also be concerned about bad data getting in.\u201d<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>Gen AI data leaks from employees are an enterprise nightmare in the making. According to a recent report on gen AI data leakage from Harmonic, 8.5% of employee prompts to popular LLMs included sensitive data, presenting security, compliance, privacy, and legal concerns.\u00a0 Harmonic, which analyzed tens of thousands of prompts to ChatGPT, Copilot, Gemini, Claude, 
[&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":1844,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-1843","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-education"],"_links":{"self":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/1843"}],"collection":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1843"}],"version-history":[{"count":0,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/1843\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/media\/1844"}],"wp:attachment":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1843"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1843"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1843"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}