For the last several years we’ve heard that artificial intelligence (AI) is here and it’s changing the world. Yet look around—what’s changed?
Sure, chatbots galore are ready to answer a question or two without your sitting on hold. And yes, dozens of tweets and LinkedIn posts are clearly penned by writers who have never eaten a hot dog or seen a sunset. But what about AI applications that fundamentally change the workflow and the productivity of your company? Those are harder to find.
Which is why AI for research and development (R&D) tax credits is such an exciting prospect. It’s here. Like, right now. And it completely upends the old, time-consuming, expensive way that R&D credits were filed.
AI Cuts Through the Labor of Supporting R&D Tax Credits
Ask any controller or head of tax: the old way is exceedingly burdensome. The three key challenges in the arena of R&D tax credits are: 1) determining what qualifies; 2) determining how much time people spent on R&D; and 3) writing narratives that are technically accurate yet tax-specific. The core issue for controllers and heads of tax is that very few R&D teams track their work in ways that easily map onto the Internal Revenue Service’s requirements.
To solve for the first two challenges, many R&D teams must fill out timesheets or surveys, follow a tedious tagging system, sit for interviews, or all of the above. This is bad for productivity and, even with a structured tagging system in place, data entered by scores of people months after the work is done guarantees inconsistencies and errors. More than that, engineering teams at innovative companies have lots of turnover—with every new hire, the R&D tagging process must be retaught and remastered.
Finally, and perhaps most important, engineers are human. That means it takes many follow-ups to ensure that they’re actually following the R&D tagging process. The hours spent tagging timesheets are a waste of your engineers’ valuable time; the hours spent training, cajoling, emailing, circling back, following up, and chasing down missing data waste your tax filer’s time (not to mention the effect on their sanity).
The fundamental shift with AI is that it can solve the three challenges of the R&D credit without a highly structured process and without burdening the R&D team. AI can ingest an R&D team’s existing data and systematically identify which projects qualify for the credit under the IRS’ four-part test. It can determine how much of a team member’s time qualifies by calculating the time spent on that qualified work from the metadata already in those systems, with no timesheets or interviews required. And last, AI can write technical, IRS-friendly narratives by summarizing the detailed R&D team data.
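Mechanically, the time-qualification step boils down to aggregating hours per engineer from ticket metadata. Here is a minimal Python sketch of that arithmetic, assuming each ticket carries an engineer name, logged hours, and an upstream qualification flag (the field names and data shape are illustrative, not any real product’s schema):

```python
from dataclasses import dataclass

# Hypothetical ticket record; fields are illustrative, not a real
# project-management API.
@dataclass
class Ticket:
    engineer: str
    hours: float
    qualifies: bool  # result of an upstream four-part-test classification

def qualified_time_share(tickets: list[Ticket]) -> dict[str, float]:
    """Estimate each engineer's qualified share of time from ticket metadata."""
    totals: dict[str, float] = {}
    qualified: dict[str, float] = {}
    for t in tickets:
        totals[t.engineer] = totals.get(t.engineer, 0.0) + t.hours
        if t.qualifies:
            qualified[t.engineer] = qualified.get(t.engineer, 0.0) + t.hours
    # Share of each engineer's logged hours that went to qualified work
    return {
        eng: qualified.get(eng, 0.0) / total
        for eng, total in totals.items()
        if total > 0
    }

tickets = [
    Ticket("ada", 30.0, True),
    Ticket("ada", 10.0, False),
    Ticket("bob", 20.0, True),
]
shares = qualified_time_share(tickets)  # ada: 30/40 = 0.75; bob: 20/20 = 1.0
```

The point is that once tickets carry timestamps and a qualification label, the per-engineer percentage falls out of simple aggregation rather than memory-based surveys.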
AI Practically Made for Tax
So, why does AI work so well for the R&D credit where other AI has failed? The easiest way to understand is to take a step back to what AI—or, more specifically, a large language model (LLM)—really is. AI, at its simplest, can be understood as computers exhibiting human-like intelligence. But the AI that’s gotten the market so excited is more specifically generative AI (gen AI), when a computer can “learn” and then create its own original content based on the rules given to it.
An LLM is a subset of gen AI that specifically “learns” and creates language-based outputs. Think: ChatGPT. OpenAI trained its LLM on hundreds of billions of words until its own gen AI could understand the stylistic and grammatical rules of language. Now, users just enter a prompt, and ChatGPT can deliver a sentence or a paragraph or a novel.
The challenge for OpenAI (and the reason ChatGPT was viewed as such a monumental step in computer intelligence) was that language is a decidedly difficult puzzle. Ask any nonnative English speaker, for example. Grammar rules seem haphazard, with many exceptions. Usage changes generationally. Even spelling changes based on the location of the English speaker. Language is a hurdle that OpenAI, incredibly, has nearly cleared. But you know what computers were built for and what they’ve always been great at? Numbers. Which is why the hurdle for AI to master tax is actually much lower.
The tax code couldn’t be more perfectly suited for AI. It’s long, jargon-heavy, and nearly impossible for a layperson to parse. That’s why almost every business leader just throws up their hands and passes off the tax function to a specialist. But for an LLM, the specificity and length of the tax code is a boon, not a burden. It gives an LLM lots of rules to train on. It lets the AI have an answer for every bit of minutiae it might come across. Large language models love large bodies of text.
So, an LLM trained on the tax code can become expert in the rules of a state or federal tax law. But then what? That’s where things get exciting.
Because so many businesses have moved their project management and payroll online with software like Jira, Rippling, and Workday, they have at hand a dataset that an LLM can quickly ingest. To calculate the amount of R&D spending a company does in a year, most accounting firms pore over payroll data and then conduct interviews with engineers and managers to identify work that qualifies as R&D for the credit. But an LLM can search through troves of data—reading more than a million tickets in a way no human filer could—while flagging business components that qualify under the four-part test and pinning each qualified business component to a specific engineer in the payroll data. A well-trained LLM can also identify duplicate entries, calculate the exact percentage of a given engineer’s work that qualifies, and finally produce both an R&D credit filing and a comprehensive study that can be presented to the IRS in the event of an audit.
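To make the dedupe-and-flag step concrete, here is a toy Python sketch. A naive keyword heuristic stands in for a real LLM classifier, and the ticket fields (`id`, `description`) are assumptions for illustration, not the schema of Jira or any other tool:

```python
# Toy signals for the IRS four-part test: permitted purpose, technological
# in nature, elimination of uncertainty, process of experimentation.
# In practice an LLM, not keywords, would make this judgment.
FOUR_PART_SIGNALS = {
    "permitted_purpose": ("new feature", "improved performance"),
    "technological": ("algorithm", "software", "engineering"),
    "uncertainty": ("unknown", "unclear", "uncertain"),
    "experimentation": ("prototype", "a/b test", "iteration", "experiment"),
}

def dedupe(tickets):
    """Drop duplicate entries, keeping the first ticket seen per id."""
    seen, unique = set(), []
    for t in tickets:
        if t["id"] not in seen:
            seen.add(t["id"])
            unique.append(t)
    return unique

def flags_four_part(description):
    """Flag a ticket only if all four parts of the test find some signal."""
    text = description.lower()
    return all(
        any(kw in text for kw in kws) for kws in FOUR_PART_SIGNALS.values()
    )

tickets = [
    {"id": "ENG-1", "description": "Prototype a new feature for the uncertain caching algorithm"},
    {"id": "ENG-1", "description": "Prototype a new feature for the uncertain caching algorithm"},
    {"id": "ENG-2", "description": "Update marketing copy on the homepage"},
]
unique = dedupe(tickets)  # the duplicate ENG-1 entry is dropped
flagged = [t for t in unique if flags_four_part(t["description"])]
```

In this sketch only the ENG-1 ticket is flagged; the marketing ticket shows no signal for any part of the test. The real value of an LLM is doing this judgment over millions of tickets with far more nuance than keyword matching.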
But can AI really be trusted with something as high-stakes as taxes? A better question is, Can we trust a process as antiquated as one that relies on the memories of engineers and managers months after the work was done?
AI Provides Real-Time Documentation
The IRS has signaled over the last two years that it has moved sharply toward requiring contemporaneous data. In September 2023, a proposed change to Form 6765 (Credit for Increasing Research Activities) asked businesses to provide granular information about each qualified business component. In July 2024, the government asked for a summary judgment denying Kyocera AVX, the multinational ceramics and electronics manufacturer, a $1.3 million amended Section 41 R&D credit lookback prepared by PricewaterhouseCoopers. In its objection, the IRS wrote: “The PwC study exclusively relied on interviews to determine employees’ time spent on projects; it did not use documentation.”
At a moment when many innovative companies use software where project data is generated passively during employees’ natural workflows, the availability of contemporaneous data has never been greater. But that greater availability raises the bar the IRS expects filers to clear. In this new, data-rich reality, some unlucky associate at ABC accounting firm would need to read through millions of pages of contemporaneous documentation, parse it, weigh it against IRS code sections, regulations, court cases, memorandums, letters, and guidance, and then create an R&D tax credit filing and accompanying study that stands up to the IRS’ higher standards. Which is why an LLM solution is so perfect.
The data is there, but analyzing it manually is expensive and time-consuming. More than that, a human parsing such a giant dataset creates nearly endless opportunities for human error. AI probably won’t write the Great American Novel. It certainly can’t replicate the intangibles of a great leader who helps startups become game-changing companies. But it can do business taxes. And it can do them really, really well.
Ahmad Ibrahim is cofounder and CEO of Neo.Tax.