AIP Podcast

AIP Podcast EP 77 - Reverse RAG and Deterministic AI Infrastructure by Formic AI

AI Partnerships Corp. Episode 77



This episode’s guest, Daniel Escott, CEO of Formic AI, joins host Anne to tackle one of the most urgent challenges in artificial intelligence: hallucinations and the trust gap in enterprise AI. Daniel shares his unexpected journey from Federal Court law clerk to CEO of an AI company, and how drafting AI guidelines for Canada’s Federal Court sparked a deeper mission to solve the reliability crisis facing generative AI across law, finance, government, and other high-stakes industries.

Daniel explains how Formic AI’s groundbreaking deterministic and observable architecture flips traditional retrieval-augmented generation (RAG) on its head. By identifying and validating source material before passing information to a language model, Formic’s reverse RAG framework eliminates fabricated citations, increases explainability, and delivers verifiable, trustworthy outputs that professionals can rely on.

The conversation also explores Formic’s innovative Explainable Language Model (XLM) architecture — a neurosymbolic, graph-based system that dramatically reduces GPU dependence and cuts energy consumption by orders of magnitude. Daniel makes the case that the future of AI must not only be safe and reliable, but also clean, ethical, and energy-efficient.

Follow AIP Affiliate, Formic AI
Website: https://www.formic.ai/
LinkedIn: https://www.linkedin.com/company/formic-ai/

Follow AI Partnerships
Website: https://www.aipartnershipscorp.com/
LinkedIn: https://www.linkedin.com/company/aipartnershipscorp/
X: https://twitter.com/AIPartnerships

The AIP Podcast is hosted by Anne Cheng on behalf of AI Partnerships, a Railtown company.

SPEAKER_00

Years ago, before I got into the world of AI, there was one of the biggest problems that AI posed: hallucinations. Often with generative AI, even citations were made up. And the basis of some of the answers was often misrepresented because of made-up references and citations. And the world of AI scientists has been attempting to solve that ever more urgently since generative AI became a force for the world of technology. But what if someone has solved that and, through deterministic and observable models, can even reduce the energy consumption of AI on a chip? Good morning, afternoon, and evening to our audience, wherever you're listening in from. I'm Anne, your host of the AIP podcast. And today I have one of the most exciting guests I've ever hosted. Daniel Escott of Formic AI has had a fascinating career and somehow stumbled into the world of AI and is solving for one of the most enduring problems of AI that has ever been presented: hallucinations. Come on, let's dive right in. Daniel, welcome to our show. I'm so excited to get started.

SPEAKER_01

Thank you for letting me join in. It's always good to be here.

SPEAKER_00

Thanks. Well, Daniel, let's dive right in, right? Tell us about your career, Daniel. You never really intended to become an AI front runner or, you know, a person who's leading an AI company. So take us through your beginnings and how you stumbled into this.

SPEAKER_01

Well, truthfully, I was a lawyer before this. I guess technically I still am. I certainly think like one. I was supposed to be a tax lawyer, and I maintain that all of this was an accident, and a happy accident, mind you, but an accident nonetheless. After I got my license to practice law here in Ontario, I started clerking at Canada's Federal Court. And among other interesting coincidences, I have a really nasty habit of volunteering for everything. So the court was looking for a clerk to join the court's technology committee, and it was just supposed to be a supporting role. But the first meeting of the technology committee was talking about AI guidelines. This was mid-2023. This was right after the first big wave with Chat GPT-4. And we were trying to grapple with the regulations and consequences and risks of artificial intelligence in not just the practice of law, but the administration of justice. So the court wanted to get out ahead of the issue and try and figure out what we need to be aware of at the very least and make other people aware of. And when this all came up, everyone looked around the table and noticed that I was the youngest in the room by at least 10, 15 years, and the assumption was made that I must therefore know the most about AI, which at the time I suppose may have been true, but I don't know how true that was. But they asked me if I'd be willing to take a stab at at least a first draft of what a set of guidelines or regulations might start to look like. And as we started iterating through that process, and I started taking more and more control over that process, I ended up very deeply involved in the stakeholder consultations. There were about 60 or 70 drafts before we ended up publishing the first set of guidelines. And I actually got involved with the way that these risks were identified and assessed.
Because it wasn't just, you know, hallucinations mean making up fake cases or citing publications that don't actually exist. But what if it makes up a fake fact? What if it makes up a fake judge and everything else about its answer is correct? Even legally trained people don't have a very good eye for this kind of information when it's presented as factual, and the responses you're getting from these models are designed to be convincing. And I started making a long list of all of these risks and issues around the technology and the architecture. And by the time we had finished this process, and I finished my time at the court, I came out with what was almost a manifesto of everything that I knew needed to be addressed for AI and law. And I started comparing notes with colleagues I had met through that process in other sectors, in finance and banking and engineering, and we all had very similar notes. So I thought, well, I don't think these are problems that AI has with law. I think these are problems that law has with AI. And that I think started, at least for me personally, this almost quest to start addressing some of these issues, because I know it's no longer something that is isolated to the legal profession or to court administration. These are societal issues. These are problems that could affect the very foundation of the 21st century system that we live in. And I think that the capacity to trust this kind of technology, given its transformative power, is the single most influential aspect of how we're going to engage with it moving forward.

SPEAKER_00

That's brilliant. You know, you came upon Formic AI almost, well, serendipitously, I'll say. So you were very impressed, but tell us why you were impressed. What is Formic AI doing that is so impressive, really?

SPEAKER_01

Well, I will preface all of this by saying this was originally during my time at the court. I came across Formic during the stakeholder consultation phase for the first and second set of regulations from the Federal Court. And I was on a trip to Toronto for something at the time. I can't even quite remember what it was, but they invited me down to their office to come take a look at a prototype, give some feedback, and see if what they had put together as a prototype would even be something the legal community might be interested in. And I figured I was in town at the time and I'm knee-deep in this stuff anyway, so it would be nice to see what direction the private sector is going in. So I took the meeting, and what they showed me was literally the first version, the first full working version, of what they now call a reverse retrieval-augmented generation framework and a paired citation system that allows us to conduct effectively RAG the way we know it today in an upside-down process, where we're not uploading an entire database to a context window and computing the response and then mapping the response back to the sources. It's identifying the sources first, identifying the information in those sources that has to go into the response, and that's the only information that's passed over to an LLM to construct a natural language response. And that ticked so many boxes for me, even just looking at what was a very rough version at the time. There were two things that kind of sold me. The first was that because this reverse RAG framework was deterministic, the citations that hyperlink the different components of the model response back to the source material are also deterministic. They cannot be hallucinated. So that immediately solves the citation issue right away.
And it also meant that we could work around the context window to conduct RAG on technically an infinitely large and ever-growing database without running into the context window size limitation. And that meant that this was an incredibly fast, responsive, secure system that opened up so many more possibilities for information retrieval using artificial intelligence while almost directly addressing the flaws of traditional RAG.
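
Editor's note: the retrieve-first flow described in this answer can be illustrated with a minimal Python sketch. This is not Formic's implementation, which is proprietary and not detailed in the episode. The `Passage` type, the keyword-overlap `select_passages` function, the `build_prompt` helper, and the sample corpus are all hypothetical, and the LLM call itself is left out. The point it tries to show is only that the citation map is fixed by a deterministic selection step before any text is generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source_id: str  # hypothetical document identifier
    text: str


def select_passages(query: str, corpus: list[Passage], top_k: int = 2) -> list[Passage]:
    """Deterministically pick passages by keyword overlap (a stand-in for real retrieval)."""
    terms = set(query.lower().split())
    scored = []
    for p in corpus:
        overlap = len(terms & set(p.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, p))
    # Sort by score descending, then by source_id, so ties always resolve the same way.
    scored.sort(key=lambda pair: (-pair[0], pair[1].source_id))
    return [p for _, p in scored[:top_k]]


def build_prompt(query: str, passages: list[Passage]) -> tuple[str, dict[int, str]]:
    """Number the selected passages; the citation map exists before any generation happens."""
    citations = {i: p.source_id for i, p in enumerate(passages, start=1)}
    numbered = "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages, start=1))
    prompt = (
        "Answer using only these passages, citing by number.\n"
        f"{numbered}\n\nQuestion: {query}"
    )
    return prompt, citations


# Made-up sample corpus, for illustration only.
corpus = [
    Passage("guideline-a", "courts may require disclosure of ai use in filings"),
    Passage("guideline-b", "tax filings are due at the end of april"),
    Passage("memo-c", "disclosure of ai use protects the administration of justice"),
]
query = "disclosure of ai use"
selected = select_passages(query, corpus)
prompt, citations = build_prompt(query, selected)
print(citations)
```

In this sketch only the selected passages ever reach the prompt, so a language model downstream can cite `[1]` or `[2]` but has no unselected material to draw on, and each number resolves to a source chosen by code rather than by the model.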

SPEAKER_00

That is impressive. Let's put it in simpler terms, though. In terms of observability and deterministic models, let's unpack this for our audience. Why is this so important and pertinent in a world where trustworthy and responsible AI is becoming such a massive need?

SPEAKER_01

Well, first I think we need to define the problem, because it's not that LLMs as a concept are necessarily problematic. They have incredible uses, and as a kind of statistical fluency engine, or for the capacity to just create a high volume of text, they're incredibly useful. But there are certain circumstances and environments and industries, like law and finance and banking and engineering and government, where you can't just rely on statistical probability. You have to have some level of assurance. And a pure LLM, and even an LLM that's been grounded with the world's greatest RAG, is still basically rolling the dice on every word and every citation. So when we consider the use cases of these industries, they basically amount to some level of information retrieval and then something: data analysis and drafting and reasoning. But information retrieval is always the first step in those processes, because these are professional sectors that rely on decades and decades of learned experience and documentation, so that we know that the work product that goes out into the world and has real-world consequences is backed up with some level of evidence and expertise. So when we move to what the solution could be, and I'm not even suggesting that Formic is itself the only solution here, the architecture becomes the problem. The combination of the transformer-based LLM architecture and retrieval-augmented generation means that we have to accept some level of risk that these kinds of sectors just can't tolerate. They're not built to tolerate it, and that doesn't mean that they're going to adopt these AI systems and grin and bear it for the next 20 years. More likely it means that they just won't convert those pilots into long-term agreements. And we're starting to see that bear out in the industry. So we had to find some way of building truth and reliability and trust into this technology. And that's where we get into determinism.
And when it comes to the creation of natural language responses, LLMs and the transformer-based architecture are, I think, the right solution for most use cases. But information retrieval and retrieval-augmented generation as it's currently done doesn't address that problem. As a matter of fact, in many ways, especially on models that are trained on massive data sets, it becomes much more problematic. We need to be able to retrieve the information that these professionals rely on so that they can take that information that's been retrieved and do the and-then part with an LLM. But getting to the and-then part is meaningless if the information that you've retrieved to generate the responses is fake. Or at the very least, it's so unreliable that the user has to become this kind of expert-level forensic auditor and try to figure out what's going on. Because again, that's just more time being spent de-risking the tool than would have been spent doing the actual work if they just didn't use an LLM in the first place. So these are the kinds of deterministic supplements to LLM architecture and the AI ecosystem at large that need to be built in to open up the kinds of possibilities for AI that will make it more trustworthy and make it more reliable, so that we're not just playing in this little sandbox where it's great at writing an essay as long as it's not being graded.

SPEAKER_00

Absolutely. Well, Daniel, just one for the road right now. I want to focus on the word architecture. You mentioned that if you're successful at creating the right architecture, we can cut the global energy consumed by data centers and GPUs. Could you tell our audience how that works?

SPEAKER_01

Absolutely. So there's this kind of GPU bottleneck in the industry, and we'll put aside the bubble around the pricing of GPUs. But from an energy perspective, an environmental perspective, traditional transformer-based architectures already require massive, energy-hungry GPU clusters to run inference and calculations. And that alone is expensive and environmentally damaging. When you compound that with the process of retrieval-augmented generation, you're not just running everything through the transformer, you're augmenting that by orders of magnitude to try and get a more accurate or a more reliable answer and grounding it. The consequence of that is anything in these sectors that requires retrieval-augmented generation to get to the and-then part of that process is significantly more damaging to the environment and significantly more expensive to operate. But if we can find a way to eliminate the GPU from that process, even if it's only at the retrieval-augmented generation stage, then we can reduce the impact of AI on the environment and its consumption of electricity by several orders of magnitude. The XLM, or Explainable Language Model, architecture that Formic has developed and that we're starting to deploy out into the world with our partners now is a neurosymbolic graph traversal system. It's a combination of several proprietary systems that have been developed over the course of the last four and a half years. But the consequence of being able to run all of these graph traversals in parallel, as opposed to as sequential tasks, is that not only do we not require a GPU for any of our information retrieval, but it's actually more efficient to do it on just a CPU. I mean, there are some specification requirements: it needs to be on multiple multi-threaded cores and it needs to have a sufficient amount of memory.
But even with all of that in mind, if we spec'd out a fantastic CPU-based system to handle the XLM architecture, it would still consume barely 5% of the electricity that a GPU-heavy system would. And that scales out exponentially the more that we integrate XLM with natural language response generation, grounding, and high-volume information retrieval. So we want to position this as not only an opportunity to build trust from a results perspective, but I think we have an obligation in this industry to build trust with the public, because the public is going to dictate the way that AI adoption takes course over the next 15 to 20 years. And if they have some inherent belief, and it's not wrong, that en masse AI adoption and deployment means building these giga AI factories that are the size of cities and are consuming the electricity of entire cities just to be able to handle whether ChatGPT is doing enough deep research for your PhD dissertation, I don't think we're earning that trust. So it's not just about safe AI, it has to be clean AI, it has to be ethical AI. And that's part of this trust building process for us.
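
Editor's note: the idea of running independent graph traversals concurrently on CPU threads, rather than one after another, can be sketched in a few lines of Python. This is an illustration of the general pattern only and says nothing about how XLM works internally. The toy `GRAPH`, the `reachable` function, and the seed nodes are made up for the example. In CPython, threads share the global interpreter lock, so this shows the structure of the approach rather than a real speedup.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Made-up toy graph: each node maps to the nodes it links to.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
    "x": ["y"],
    "y": [],
}


def reachable(start: str) -> list[str]:
    """Breadth-first traversal from one seed node; returns nodes in visit order."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in GRAPH.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


seeds = ["a", "x"]
# Each seed's traversal is independent of the others, so they can be submitted concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(zip(seeds, pool.map(reachable, seeds)))
print(results)
```

Because each traversal reads the graph without modifying it, no locking is needed here; a system that updates the graph while traversing it would need more care.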

SPEAKER_00

That's wonderful. Well, Daniel, I've truly, truly enjoyed our time on this episode. Unfortunately, that's all the time we have for today. To our audience, well, I hope you have enjoyed yourself just as much as I have. If you have benefited from this episode, please do like, share, and follow the series so we can bring ever greater guests and insights into the development of AI into your inbox. Once again, my name is Anne Cheng, and on behalf of the AIP podcast, I am signing off.