Gray Matters
Mission Matters Podcast
🎙️Ep 2 - Autonomous Cyber: Building the AI Platform for Offensive Cybersecurity

In this episode of the Mission Matters podcast, Akhil and I speak with Patrick O’Brien and Bohdan Volyanyuk, the founders of Autonomous Cyber, a young startup building the future of AI for offensive cybersecurity operations.

In this episode with Autonomous Cyber we cover:

  • How generative AI is reshaping the offensive cyber kill chain

  • The role the fifth domain plays in modern warfare

  • Why augmentation beats automation for cyber operators

  • Building trust and feedback loops with national security customers

  • What it takes to go from zero to validated product in a sensitive mission space

  • And much more

You can listen to the podcast on Spotify, Apple, the Shield Capital website, or right here on Substack.

As always, please let us know your thoughts, and please let us know if you or anyone you know is building at the intersection of national security and commercial technology. And Autonomous Cyber is hiring! Please reach out to their team if you are interested in working in the future of offensive cybersecurity. Tune in next month for our next episode!


Here is the full episode transcript:

Patrick 00:00

I think, for people in the information security space, or especially offensive cybersecurity, everyone in the world saw LLMs and thought the exact same thing: this is a hack bot. Every InfoSec professional that I know is using LLMs in one way or another, and folks are finding ways to speed up their workflows, finding ways to get to information first, whether that's a vulnerability, whether that's, you know, going to a particular place in the world. The people who are starting to succeed are the ones that are using LLMs. And what we're building at Autonomous Cyber is a platform to do that most effectively.

Maggie 00:38

Welcome to the Mission Matters Podcast, a podcast from Shield Capital where we explore the intersection of technology startups and national security. I'm Maggie Gray, and

Akhil 00:47

I'm Akhil Iyer.

Maggie 00:48

And we are your hosts from the investment team at Shield Capital. In this podcast, we discuss the technical challenges of developing and deploying commercial technology to national security customers, as told from the founder’s perspective. In this episode, we're joined by the founders of Autonomous Cyber, a young startup building an AI-native platform for the future of cyber operations. Cyberspace is often referred to as the fifth domain of warfare, right up there with land, sea, air and space, and for good reason. It's been a serious tool in the national security toolkit for a while now. For example, take the Stuxnet attack back in 2010, in which hackers were able to disrupt Iran's nuclear centrifuges and inflict significant damage on Iran's nuclear program. These days, it feels like there's a new headline about a cyber breach almost daily. Just this past September, for instance, Chinese hackers were able to access the communications of senior US officials through what's now known as the Salt Typhoon attack.

Akhil 02:02

The bottom line is that both our everyday lives and modern warfare are deeply tied to digital systems, and those systems are vulnerable. And despite many technological advancements, cyber organizations and operations remain relatively manual. It takes teams of human experts to map out attack paths and defend systems effectively. But now, with large language models, especially ones built for reasoning, we're seeing a shift. Since GPT-3.5 came out in late 2022, organizations have been using LLMs for everything from threat detection to automating security operations centers to finding vulnerabilities and even running penetration tests.

Maggie 02:40

Autonomous Cyber CEO Patrick O'Brien knows this world inside and out. He spent seven years at the National Security Agency, NSA, the government's top cybersecurity agency, before linking up with his co-founder, former Army officer Bohdan Volyanyuk. They met in business school and teamed up to bring the power of generative AI to cyber operations for the Pentagon and private sector alike. Patrick and Bohdan, thank you so much for joining us. So to start out, can you guys tell us a little bit about the origin story for Autonomous Cyber and a little bit about what you guys actually do?

Patrick 03:15

Yeah, Maggie, thanks so much. We're super excited to be here and super excited to be partnering with Shield to build this company. So the origin for Autonomous Cyber, the analogy I like to tell is in the Oppenheimer movie. At the beginning of the movie, there's this moment where this paper comes out, I believe it's on the splitting of the uranium nucleus, and Oppenheimer and Ernest Lawrence basically say every physicist in the world saw that paper and thought the exact same thing. I think for people in the information security space, or especially offensive cybersecurity, everyone in the world saw LLMs and thought the exact same thing: this is a hack bot. And I think that we saw that back in 2022 when we really started working on this, like pre-ChatGPT and all that. This was a really big opportunity to address some critical USG mission needs in the fifth domain, and we have been working on it since.

Bohdan 04:06

Hi Maggie, thank you for having us. My background is not as spooky as Patrick's. I actually originally came to America as an immigrant from Ukraine. My dad won a green card through the national lottery when I was six years old and immigrated myself and my mom over to the US. I ended up majoring in mechanical engineering at NC State, and then took a slight detour when I joined the US Army. Did the whole Airborne School, Ranger School, Jumpmaster thing, deployed to Iraq and Syria as an infantry officer with the 82nd Airborne, did some non-combat work in Europe with NATO. Got out a couple years ago, and was very fortunate to get into MIT's two-year MBA program, where I met Patrick. MIT has a long history of supporting the national defense mission, and I got sucked right back into this landscape by being the president of the Defense Tech Club here and running the Technology National Security Conference here, which we're actually attending today and tomorrow. So we're excited about that. And I think a pivotal moment for me was also interning at Vannevar Labs during this time. They had just raised their Series B at this point, and I had the opportunity to deliver some cutting-edge Silicon Valley tech firsthand to the end users, both stateside and abroad. And this really opened my eyes to how, in the wake of companies like Anduril and Vannevar Labs, there is a pathway for new entrants to enter the defense market and deliver technology that provides critical mission wins without having to compete with primes and huge programs of record immediately.

Maggie 05:25

I remember the first time back in November 2022 when I started playing with ChatGPT. I'm pretty sure I used it to try and create some poetry. I think it helped me maybe with a few of my essays in school. I definitely did not think about using it for penetration testing or any kind of cyber operations. So what was the moment early on, when you guys were experimenting with this technology, when you thought to yourselves, okay, this actually works for this use case?

Patrick 05:53

Yeah, it's funny. You know, I think for us, it really was right away. I remember, I think we were building on top of InstructGPT at the time, something super, super basic. And we put it on a lab and hacked the box. I think it was just a very easy box, I think it was called Fawn, and we just connected to an FTP server, it authenticated and it logged in. We just started with a simple prompt of, like, hey, go do this. And I think even from that moment in time, it was just so obvious to us that this was a completely different way of compute working, and that there were going to be opportunities to really change how cyber operations work. So the way we think about cyber operations in the kill chain broadly is we think about, okay, there's vulnerability discovery, exploit development, and then operationalization. And it was just clear to us then that we could build tooling that could, you know, upskill humans and make us more effective at each individual part of that kill chain. And it's something that we've done since then. You know, we reported a couple CVEs back in December, and, you know, that's December, so we have some cool updates since then, too. And I think it's really been validated in the past couple years that that approach works. But, you know, that spark, I think, has been there from the beginning, and it's been really fun to be building on top of this.

Maggie 07:06

For folks who aren't deep in the weeds of penetration testing, which I imagine is probably most of the people listening to this podcast, can you just tell us, what does penetration testing actually look like? What did it look like before people had tools like Autonomous Cyber, and what's different now?

Patrick 07:25

Yeah, one of the hard things about this industry is that, like, every word means everything, and it means different things to different people. So even the fact that we have cyber in the company name, a lot of people who are good at this domain will just cringe at the fact that we even call it cyber, right? They call it information security. And there's some interesting stuff to talk about here, too, in terms of bridging cultures, especially with the US Department of Defense, where I remember at one point someone talked to me about, like, navigating the net. I was like, what are you talking about? So, you know, when we talk about penetration testing, that means a bunch of different things to a bunch of different people, and broadly, that's kind of why we focus on the kill chain. So we talk about, okay, folks who are finding a vulnerability, and maybe just patching that, right? Folks who are trying to do reverse engineering, looking at often compiled programs. Folks who are inside of a network, actually doing network operations, maybe they're on a red team somewhere commercially, or trying to get into a network from the outside, maybe doing something that other people would call web application security. I mean, there's 1000 names for all this, but by and large, it's just really hard, right? We used to say that cyber is hard, and it's something that requires a very deep skill set in a narrow domain to be actually, effectively good as a penetration tester, whether you're working at XYZ commercial firm or wherever else. You know, for us, I really love the quote that the future is already here, it's just not evenly distributed.
And you know, that quote's been batted around a lot recently with LLMs, but I think it's particularly true in this domain, where every InfoSec professional that I know is using LLMs in one way or another, and folks are finding ways to speed up their workflows, finding ways to get to information first, whether that's a vulnerability, whether that's going to a particular place in the world. The people who are starting to succeed are the ones that are using LLMs, and what we're building at Autonomous Cyber is a platform to do that most effectively. So I think that we're already seeing those changes. I think anyone who's a capture the flag player, which is sort of like a gamified cybersecurity competition, I don't think you're performing well in those right now without the aid of LLMs. And so I think that it's already here, it's happening, and we're building a platform to take us to the next step.

Maggie

You guys are building a true dual-use technology. How does the US government use your product, and how does that compare to how commercial customers use it?

Bohdan 09:48

Yeah, Maggie, for the DoD customer, the way we kind of break down the cyber kill chain is vulnerability research, exploit development, and penetration testing, which can also be called operationalization and has a lot of other buckets within it, like analysis and actual red teaming. But the tagline that we're kind of using is force multiplier for the cyber warrior. So anywhere within that kill chain that DoD operators sit, whether they're developers or actual OCO operatives, we are building technology to upskill them and augment them in any facet that we can.

Akhil 10:18

And, Bo, what about the commercial piece? To Maggie's point, and we've been discussing this a lot here at Shield Capital, dual use can mean a lot of things to a lot of people. When we think about it, we think about it as how do you leverage both sectors at the right place and time to enhance and build momentum in the types of products you're building? And that might come sequentially, that might come simultaneously, or in some other fashion. Curious what that looks like, because I think a lot of folks in this space from the cyber domain are actually looking at it almost purely commercially, in large part, probably, because the adoption curve within the enterprise space is a little bit faster.

Patrick 10:57

Yeah, Akhil, I mean, so many thoughts on this, right? So I think point number one is that if you just look at our ideal customer profile, it's the offensive cyber professional, wherever they sit in that kill chain. And that individual is actually a very similar person, whether they're working at, again, I don't want to name a specific commercial firm, because we want to work with all of them, whether they're working at a specific commercial firm or inside the government. And as we were building this, you know, in various iterations of this company, we've been defense only. We've looked at dual use. We've been back to defense only. And I think where we really have conviction now about going dual use is, fundamentally, we're just building for the same end user, and the actions they're taking and the tech are the same. And we've received pressure from other investors that we've talked to to be defense only, and it just doesn't make sense in this domain. You have to do something that truly makes sense for what you're building. And I know that, too, from being on the inside. That's something that's different about cyber as the fifth domain, right? None of the good tech is government only. It's something that really exists out in the real world, in the commercial world, much more prominently than in many of the other domains, right? So, I mean, maybe there's one out there, but one of our best friends says you're not gonna have a dual-use hypersonics company. Maybe you are, I don't know. I don't wanna, you know, get into a spat with those people, because they have hypersonics. But, you know, cyber though, information security.
So much of the best resources for this, so much of the best customer feedback, is out there on the commercial side. So I think that you're really doing a disservice otherwise. And I think the tools that are government only are looked at very skeptically by the government customer. So I think that if you're coming at this from, you know, maybe a pure defense tech perspective, a pure hard tech perspective, I understand when people say, oh, you should be defense only. But for us, it just makes so much more sense, and it has to be, in some ways, commercial first. Our feedback loops are so much tighter with our commercial pilots right now, and it's something we really appreciate. And again, the reason why we exist as a company is to help the United States succeed in the fifth domain. But we're just getting such good feedback on the commercial side, so it's something for us that I think has really been validated as the correct decision.

Maggie 13:08

When you guys are talking with your customers today, you mentioned these are people that are mostly using command-line-type tools, a mix of experts and non-experts. What are the biggest challenges that they are facing today, and how do you guys see Autonomous Cyber fitting in to really address those challenges?

Patrick 13:29

Staffing. It's staffing, right? One of the pen testing firms we talked with, it's a pen testing unit within a larger firm, and they told us, like, hey, we have a team of eight. We wish we had 24. We just can't meet demand, right? So staffing and scale, and it comes back to, like, multi-agents. Okay, you can get your junior users operating at a higher level faster, right? So they can do more client work. And you can take your expert users and scale them across more clients, right? So I think scale and staffing is the number one thing we hear, and that's mirrored on the defense side as well, but especially on the commercial side, scale and staffing.

Maggie 14:04

What's the biggest aha moment when customers see your product?

Patrick 14:08

You know, we had this one moment where it was small, and this was a year ago now, but basically, we were dealing with a 200,000-line code base, and our tool basically triaged it immediately. It's like, here's the four places to look, right? I think we got a literal audible wow from the customer we were working with at that point. And, you know, it saved them however much work. I think more of the wow and aha moments that we're going to get soon are going to be on explainability of what's happening in sort of complex systems. I think of the speed with which our model can go in and tell you, hey, this function does this, this function does that, here's how they're all related, here's where I would look. Again, it comes back to the future is here, it's just unevenly distributed, even when people are doing very, very complex things in the kill chain that are well beyond what an AI can do today. So having the LLM, with these interfaces that we discussed, quickly orient you towards, hey, here's how this function works, here's the broader role of this. I mean, that saves you 30 minutes from reversing that thing or doing whatever else. So I think that it's a lot of small wows. And I think that's how you win more.

Maggie 15:21

Patrick, so you keep using this term fifth domain in warfare, and I know that this is something that you are actually an expert on. So can you just take a step back and tell us a little bit more about the role of the fifth domain in modern conflict, and maybe a little bit about the current state of US government competency in this domain and the challenges that still exist for the US government and allies?

Patrick 15:44

A lot of the framing on this comes from a talk at Black Hat. I think it's by the grugq. I think it's like 2018, Black Hat Asia or something, where you have this concept of, okay, you have the first domain of land warfare, second domain of sea warfare, third of air, fourth of space, and the fifth, this new dimension, is cyber. And it's interesting insofar as it intersects with the other dimensions. And what I mean by that is you're going to fail if you try to use analogies of land mass and analogies that work in these physical domains of land or sea or whatever. The interesting thing about this domain is that you can pop up in different places. You can pop up at one place in one dimension, and pop up in another place in another dimension. So I think, broadly, there's under-appreciation for how ubiquitous compute and networking are in modern life. And I think that if you just go back to, like, just fundamentals of theory and history of war, it should be completely unsurprising that those fundamentals are going to absolutely determine outcomes of future wars. So I think that for us, the way we think about that is this is an incredibly, incredibly important domain, and it's something where it's as important as air, and it's something that is new, and it's something that's going to play a major, major role in future conflicts for the United States. And that's why we exist as a company, to make sure that we're well positioned for that.

Bohdan 17:16

Now people are starting to realize just how much of an impact cyber has on the other four domains of warfare, just like Patrick mentioned. There's a great 2020 Task & Purpose article that talks about how in 2001, 90% of SOCOM mission sets were kinetic; by 2020, over 60% of SOCOM mission operations were in the information domain. Former SOCOM Commander General Richard Clarke, now retired, had a great quote in the article. He said, we need coders, and that the most important person on the mission is no longer the operator kicking down the door, but the cyber operator who the team has to actually get to the environment so he or she can work their cyber tools into the fight. And that's where the heart of the company lies. It's not just about building a custom solution for a specific cyber unit, but about building tooling for that Green Beret, or that infantry sergeant, who VTIP'd into cyber, had a six-month training course, and is now being told, hey, you have to go deliver a cyber effect. It's really about upskilling the broader DoD base that touches cyber and acting like the force multiplier for that cyber warrior.

Maggie 18:14

I know one of the things that people talk about when thinking about autonomous kinetic weapons is that at some point in the future, we may need to take a human out of the loop, because our adversary will, and in order to actually keep up with the pace of warfare, it needs to be robot versus robot. Do you think that we're headed towards that in the fifth domain?

Patrick 18:34

No, I don't. I'll just put this in the context of trying to do a pen test, just entirely on the commercial side. I mean, it's just so, so helpful to be on the loop right now as a human. Even when we were building this on top of InstructGPT back in 2022, like we talked about earlier, where we were just logging into an FTP server, it's always been helpful to, you know, get our tool out of loops, to poke it in the right place, to tell it what to look at. And we've had this argument with a number of folks, and there are some big names in the space and big thinkers who disagree with me, but I just don't see how you could take the human out. There's a really good Walter Isaacson book called The Innovators, and it's sort of about the history of compute in the digital age. And part of the argument that Isaacson makes in this book is that human-AI teams have always outperformed raw autonomy. And again, I just don't think that fundamental incentive structure, that concept, is going to change as we keep going forward. So our bet as a company is that, you know, human-AI teams will outperform AI alone. We'll see if we're right about that.

Akhil 19:49

Hey, Patrick. Let me come back to the very top of the discussion. Love the Oppenheimer reference. Like Maggie was mentioning, I think we all had this moment in November 2022, certainly us as a firm. You know, half our firm came from the cyber world: Raj selling his first company to Palo Alto Networks, Mike running Symantec, the rest of us looking at the space pretty deeply. And I think all of us had this sense that there was going to be a lot of fear that Gen AI would supercharge cyber attacks. Are we actually seeing that happen? Has it been more hype vs. reality? Curious on your thoughts.

Patrick 20:26

I mean, I think from our perspective, just with what we're building, it's absolutely real, right? I mean, we published a couple CVEs back in December. And, again, another movie quote for you, Akhil. I just watched a re-screening of The Matrix recently, and there's a difference between knowing the path and walking the path. And that's our role as a company, to walk the path, right, and to just be that offensive cyber platform for the offensive cyber professional, whether they sit in government or in a commercial space. And I mean, we use this tool. It works. We're about to run a big test, I will say, and are anticipating some pretty promising results. It's just so fun working on this problem, because you just see our tech do amazing things every day in new ways. So in the sense of, has there been X, Y, Z impact from it yet? You know, I'm not sure I'm even best positioned to comment. We had a conversation recently with someone who runs one of the biggest response firms in the world, and that person had some pretty interesting insights. And I think those people will share those insights in time. I can tell you, from our perspective, building these for Team USA, they work and they're awesome.

Akhil 21:44

Patrick, two-part question. One: do large language models inherently favor the cyber attacker or the cyber defender? And if the answer to the question is the cyber attacker, will the defenses ever keep up?

Patrick 22:00

I don’t want to say I don't care about the question. We're here to make sure the US government excels in the fifth domain, right? That's our role in this. There's a lot of really smart people who have done really cool companies. You know, we met with some of these people who are at OpenAI right now, and we have talked to them and are continuing to talk to them. There's a lot of folks putting out a lot of smart thoughts on this. As for our role in this whole thing, Ben Buchanan has another book called The New Fire, and he talks about these three groups when dealing with AI: the Cassandras, who are warning about it; the evangelists, the people who are pushing it forward; and the warriors. We're the warriors in this one. That's our role. And, you know, again, we need some humility about that. We need to understand that it's not all about military applications of this stuff. But our job is to push it forward. So I don't want to say I don't care if it favors attack or defense. Better or worse, our job is to make us good at it.

Akhil 22:56

No, that's great, Patrick. Let me maybe reframe it. What do we do? How do we defend against adversaries, whoever they might be, and their own capabilities? And, you know, how worried should we be?

Patrick 23:09

I mean, there's this other term, FUD, fear, uncertainty and doubt, in information security, and that's kind of a theme, especially in cyber podcasts, right? It's like, big thing, be afraid. I think again for us, I'll give basically the same answer. There's a couple firms that I really like that are unifying static and dynamic analysis. So part of what we do is unify static and dynamic analysis. We can look at source code, we can interact with the running application, and we can combine those two things. But we're not the only ones doing that. There's lots of other firms doing that, especially on the defensive side. There's going to be some new defensive cybersecurity firms popping up that are helping us combat this, and a lot of those are actually incumbents that are adopting AI. Our role, though, again, is to make us good at it. So on the commercial side, part of that is we're making our pen testers better, right? So in one of our pilots with XYZ pen testing firm right now, when they're supporting a Fortune 500 American company, it's our tech behind that that's helping those pen testers find vulnerabilities faster than, you know, another country. I think there's a lot of talk in the industry about, like, oh, this is all defensive, and we're doing things for good. And then there's this sly, secondhand thing of, like, quietly, we're also doing offense. Like, we're doing offense. That is who we are as a company. That's what we're doing. And I think it's really core to why we exist and the type of people we hire and recruit as well.

Maggie 24:34

Thanks, Patrick. So I want to shift gears a little bit and dig into a couple questions about how you guys are actually building this technology, to understand a little bit more about what exactly Autonomous Cyber is doing that's really different from what other companies or researchers are doing in the industry. So, you know, I've played around with your guys' product. It's pretty amazing what it's able to do. It definitely brought me back to my intro to computer security classes, learning about fuzzing and operating systems and web app security, SQL injections, you know, all that great stuff. So where does the tech work best right now? Are there specific classes of vulnerabilities or types of code bases or operating systems where it's particularly effective? And how are you guys prioritizing what to focus on there?

Patrick 25:22

Yeah, absolutely, Maggie, I love that question. So the short answer is web application pen testing, right? Our product roadmap is fairly simple in that we just move towards classes of technology that these models are good at first, right? So we started with source code only, just reviewing source code, looking at it for vulnerabilities. We moved to unifying static and dynamic analysis, with source code and then actually interacting with a live web application. Some really cool stuff we're doing right now is black box web app testing, which means web app testing without source. That's where we are right now, and that's also somewhat a reflection of the skill set of our team. You know, thanks to our partnership with Shield, we are about to do some really, really cool stuff. And I'm really excited for basically the stuff we're about to build this year. So it'll be fun.

Maggie 26:13

Patrick, it seems like every other pitch that Akhil and I get these days has something to do with, you know, AI and cyber and the future of cyber operations. What makes your guys' tech special and different from what other people are building?

Patrick 26:26

Yeah, Maggie, I appreciate that, and I think there's two answers there, and they're both related to the vision of who we're building for and what we're building. I think, one, a lot of companies fall on the spectrum between autonomy and augmentation. And companies that are building either defensively or largely for enterprise, people that are selling to CISOs, broadly, they're building things where there's a huge incentive to automate, right? Whether it's a software development lifecycle integration, whether it's an attack surface management tool, a lot of times CISOs and these individuals want something that's a dashboard. They want something they can see, where they can get system health status and so on. You know, no one that I know that works in InfoSec as an end user, as a penetration tester, is using a tool with a dashboard, right? So part of the big thing I say to our product team is, if we build a dashboard, we've lost our way. We're so, so far on the augmentation side. We're really building for that core end user. And what that means in practice is there's certain demands that puts on our tool. So number one is interactivity, right? It has to be something that a human can steer, a human can drive, and that has a high degree of human interaction, and that comes at the expense of being able to build these autonomy features. And that's how we kind of find our niche of the market. Number two, and this comes back to philosophy and vision, is that fundamentally, the most important thing when you're talking about dealing with a system or a program under test is the interface between the large language model and the program under test. You have to find a way not only to go from the LLM to the system. That's fairly trivial in some ways.
So it's like you have a React agent, where you have, basically, you know, a thought and then a command, and then you issue a command to a tool which interacts with the system. That's that's fine. What's really difficult is representing information back from that system to the LLM in a flexible and sustainable and robust way. That's actually a super, super hard engineering challenge, and it can be something as simple as, like, you know, there's different types of terminals. There's these very stateful terminals where you issue a command, you get something back. But it doesn't always happen that way. Sometimes you can just have a stream of information, and representing a stream back to the LLM is hard. It's really hard on a browser, right? There's like, these open source tools where it's like, Okay, we just hook up an LLM through a browser. But if you want to do sophisticated things in a browser, like representing how a web application has changed back to an LLM, and always giving it the right information and giving it enough information so it doesn't get stuck in loops, but not just dumping, literally filling up the entire context. Was there more than it with some change in the HTML, all that stuff is actually incredibly difficult, and it's something that I think that's the main thing that's holding back progress right now in this field. It's something that we're dealing with, that our good competitors are dealing with as well, and so it's really all about interfaces. It's about how you change, how you exchange information between the LLM and the system under test, and that changes depending on the type of system, whether it's compiled code, whether it's web application source code and so on. So for us, I think we just have this vision that it's a space. I think we're the only ones occupying it right now. Think more will join. 
But where we're trying to build human augmentation and these interfaces, and that's fundamentally what we're doing at Autonomous Cyber is different than most other folks.
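Editor's sketch: the thought, command, observation loop Patrick mentions, together with the "represent the change, not the whole page" idea, can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions, not Autonomous Cyber's implementation: `summarize_change`, `react_loop`, the `policy` callable (a stand-in for an LLM) and the `tool` callable (a stand-in for a terminal or browser) are all hypothetical names, and a plain line diff is just one of many possible ways to compress an observation.

```python
import difflib


def summarize_change(previous: str, current: str, max_chars: int = 400) -> str:
    """Describe how an observation changed, within a fixed character budget."""
    changed = [
        line
        for line in difflib.ndiff(previous.splitlines(), current.splitlines())
        # ndiff marks removed lines with "- " and added lines with "+ ";
        # unchanged lines ("  ") and hint lines ("? ") are dropped.
        if line.startswith(("+ ", "- "))
    ]
    if not changed:
        return "(no change)"
    text = "\n".join(changed)
    if len(text) > max_chars:
        # Never let a single observation flood the model's context.
        text = text[:max_chars] + "\n...(truncated)"
    return text


def react_loop(policy, tool, goal: str, max_steps: int = 5) -> list[str]:
    """Run a thought -> command -> observation loop.

    policy(transcript) returns (thought, command); a command of None stops.
    tool(command) returns the raw observation from the system under test.
    """
    transcript = [f"goal: {goal}"]
    last_observation = ""
    for _ in range(max_steps):
        thought, command = policy(transcript)
        if command is None:
            break
        observation = tool(command)
        transcript.append(f"thought: {thought}")
        transcript.append(f"command: {command}")
        # Feed back only what changed since the previous observation.
        transcript.append(
            "observation: " + summarize_change(last_observation, observation)
        )
        last_observation = observation
    return transcript
```

The design choice worth noting is that the loop itself is the easy part; the budgeted, change-oriented `summarize_change` step is where the hard interface work described above would live in a real system.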

Maggie 29:31

It's been wild to me just how fast new models are coming out and how they seem to be beating benchmarks every other month. Are there any that you've found have been particularly useful for your product, or any that didn't really live up to the hype?

Patrick 29:44

Yeah, I'll say the sort of interesting thing here is, like, fine-tuning, right? So I think by and large, and this has been many people's experience. We have a research partnership with a lab within the Computer Science and Artificial Intelligence Laboratory at MIT, and I was talking with one of the researchers there. I was like, okay, what are your thoughts? Should we fine-tune? Should we use a frontier model? And he was like, dude, no one knows. He's like, no one knows what's better or worse. And, yeah, there's some stuff you can tell about XYZ specific case. So I think what's been sort of the most surprising for us, I mean, it's for everyone, right, is just how fast the frontier keeps moving out. And I think every time we try to do something, I think we've been not burned, but I would say we've felt the heat of the bitter lesson a few times. The bitter lesson being that general methods overtake specific methods in the long term, and it's just kind of trying to stay balanced on that frontier while not getting too honed in on something specifically. And then, I mean, it's crazy. We're a seed-stage company, but even literally yesterday, we're gonna have to burn down a lot of our code base and build it back up because, again, there's a specific standard that's coming out and just all these new tools that have been released. So you have to be moving so fast in this space that it's not really so much specific models as it is that there's so much being built around these models. It's a really fun space to be in, but it's something you have to have your head on a swivel for.

Maggie 30:59

Any thoughts on some of the models coming out of other countries, you know, maybe in particular DeepSeek's R1 model?

Patrick 31:05

Well, why that one in particular? Maybe, yeah. I mean, I think I'll call back to Nicole's book here, This Is How They Tell Me the World Ends. She does a good job of talking about, kind of, the culture and history behind a lot of the fifth domain, specifically in America. And there was this idea of, like, NOBUS, or "nobody but us." And, you know, Nicole talks about it for a variety of reasons, but that basically was bunk 10 years ago, and it's even more bunk now, right? Just the idea that, hey, we should hole up and work on these things in a completely cut-off-from-the-world way, and that will sustain some form of advantage. I mean, it's comical. And so I think that, again, without trending too much into fear, uncertainty and doubt, the idea that this is going to be done quietly, privately and disconnected from the world is not a plausible way to get to American advantage in the fifth domain. So my thoughts on some of the models coming out from other countries, and it's not just the models. I mean, there are versions of our product that are built in other countries and advertised. And I think for anyone who's seriously worked in national security, it requires a degree of respect for the adversary or target, or whatever you want to call it, and I think that it's really important to be cognizant of their strengths and weaknesses. But let's not just be super arrogant about, oh, we're so great at all this stuff, and no one else is going to be good at it. It's important to understand, like, it's important to have respect for the other side too.

Maggie 32:41

How do you guys actually benchmark the performance of your system?

Patrick 32:45

Yeah, it's sort of interesting. So I think, you know, in the early days we were thinking about this of, like, okay, we'll try these Hack The Box machines: very easy, easy, medium and so on. Um, you know, there's another AI company that I like a lot, and we've talked to them, and we're doing something slightly different than them, I believe. And, you know, they have a $20 million seed, and they have a really good benchmarking system. I don't think we actually have the capacity to do that. I think for us as a startup, we just have to be moving so fast that where we benchmark is on user uplift, right? Like, I'm a huge fan. And, you know, I think my employers will probably risk me for this, but it's like, build product, talk to users. It's a Sam Altman phrase, and we just use it all the time. And I think where I benchmark, and how I feel about where we're at as a company, is what our users are saying to us. I think that's really what motivates us and what keeps us moving and what keeps us moving quickly. So we don't have, you know, a benchmark on, like, 31 out of 34 PortSwigger labs. We are really focused on, hey, how do our users value this? What value are they getting out of it? And that's, it's hard to quant, you know, I'm a math guy by trade. It's hard to quantify, and I think it probably shouldn't be quantified.

Maggie 33:58

Another question I want to ask, sort of on a similar line. How have you seen the challenges of actually deploying this kind of technology differ when working with a DOD customer versus working with a commercial customer?

Patrick 35:16

Yeah, well, Maggie, that comes back exactly to what we were talking about with dual use. It was so funny. So we had our first meeting with one of the Big Four, right? We were meeting with a pen testing shop at one of the Big Four, and literally, the first thing they asked us for was an on-prem solution. It's like, oh. And so it is crazy how similar the needs are and how similar the roadmap is. It is absolutely wild. And again, it comes back to that conviction that dual use is the appropriate strategy for this company. Yeah, it's bizarrely similar. I'll say that.

And do these customers that you guys are working with actually have the infrastructure that they need to deploy large language models in this way on what is ultimately sensitive information for them?

So I'll talk about the commercial side. Sometimes, yeah, and that's actually a big qualifier for us. It's like, hey, do you have an LLM? Like, for this customer with the Big Four, right, do they have an LLM that they can use on client engagements? And the answer for them is yes. It's not always the case for commercial partners, but usually what it is is, okay, we stand up our on-prem server environment in their cloud, and then point that at an LLM that they're able to stand up. Like, if they're on AWS, right, it's pointed at an LLM that they're able to stand up in AWS. So generally, yes, not always. Oftentimes, we're the first, or one of the first, uses of LLMs that they have in a particular environment on the commercial side. And I think that's a really good qualifier for us when we look at the sales process of, like, okay, are they ready for this? Do they have an LLM in place? Because, you know, there are other things you would use an LLM for. Chat bots are super useful, right? And so that tells us, okay, they're ready for this.

Maggie 35:52

So one of the things I find pretty interesting about Autonomous Cyber is that not only are you guys building for, you know, maybe the lower level cybersecurity analyst, but you're also building for really expert users, right? Some of these penetration testers have extremely unique skill sets, decades of experience. So how do you guys think about building a product that really serves both of those kinds of use cases?

Patrick 36:17

Yeah, no, Maggie, that's awesome. So one of our alpha users is a guy who makes his living on bug bounty, which is, like, crazy if you know how good these people have to be to support themselves and their families on bug bounty. I mean, you have to be good, and you're really betting your income and your skills on that. And so for him, the model is not superhuman, or our platform is not superhuman, in the sense of, okay, it'll send requests that just don't make sense sometimes, or a human would realize that there's not a real vulnerability there, because there's some conflicting information in the request. And as the models get better, as we come to GPT-5 or 6 or 7 or whatever, that's not gonna be the case anymore, but it still is the case today. So our platform is not yet superhuman in the sense of being beyond an expert pen tester. But the way that person uses our platform, and this is getting more into our tech: we have what we call multi-agent, and I think of it like multiball in pinball, right? So that person is able to federate basically his methodology across, right now, we're just doing it with 10 agents. And so our token costs are pretty, I mean, I wouldn't say extreme, tokens aren't that expensive, but we're using a lot of tokens, and we have 10 of these agents running simultaneously in the background. So in the case of the expert user, he's just using this in a way to go faster, right? To find things and get information about different parts of the system and conduct more involved reconnaissance on the bug bounty programs he's doing, in a faster way. And our goal with him is to reach something that, I think it would identify him if I said the exact term he uses, but it's to reach this point where, you know, basically the bounties we get are greater than the token costs, and I think we're very, very close to that.

And you're going to see a real change in the way bug bounty works this year, because us, and probably some other people, are going to get to that point where you can just point these things at bug bounty programs, and it just shakes them out, like, I don't know, pick a metaphor, shaking something out of a tree. So that's the expert side: it's multiple runs simultaneously, which, from a UX perspective, is very difficult, very, very hard to do. And it comes back to what I was saying earlier about us being an augmentation company, where being able to expose that information to the user in the right way, when you have 10 agents all running, and not just have it be this automated scan, is a very difficult UX problem. And again, it also goes back to our staffing model, like why we need top, top people. Because you can only do that with top people, which our people are, which is awesome.

On the more beginner user, I mean, I think that this is a really cool part of our platform where, because you can interface with this thing through natural language, it offers a really nice bridge to beginner users. So it's something where you have whatever level of expertise you have, but you can talk with our agent about why it did something, why it found something, or if you want to go to a different place, and you can do that all in natural language. So I think building for both customer sets has been a lot smoother than I had anticipated when we started, and I think the combination of the natural language interface and the ability to go multi-agent is how we bridge that gap.
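Editor's sketch: the "multiball" idea (many agents running at once) and the break-even test (bounties greater than token costs) can be written down compactly. The following is a minimal illustration under stated assumptions, not the company's architecture. `run_agent` is a hypothetical placeholder for an agent that would call an LLM and tools, the token counts are invented, and the per-token price in `is_break_even` is a parameter you would supply yourself.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    target: str
    findings: list[str]
    tokens_used: int


async def run_agent(target: str) -> AgentResult:
    """Hypothetical agent: a real one would drive an LLM and tools here."""
    await asyncio.sleep(0)  # stand-in for network and model latency
    return AgentResult(target, [f"note about {target}"], tokens_used=1_000)


async def fan_out(targets: list[str], max_agents: int = 10) -> list[AgentResult]:
    """Run one agent per target, with at most max_agents active at a time."""
    semaphore = asyncio.Semaphore(max_agents)

    async def bounded(target: str) -> AgentResult:
        async with semaphore:
            return await run_agent(target)

    # gather returns results in the same order as the input targets.
    return await asyncio.gather(*(bounded(t) for t in targets))


def is_break_even(bounties_usd: float, tokens_used: int,
                  usd_per_million_tokens: float) -> bool:
    """True when bounty income exceeds what the tokens cost."""
    token_cost = tokens_used / 1_000_000 * usd_per_million_tokens
    return bounties_usd > token_cost
```

A run would look like `results = asyncio.run(fan_out(targets))`, followed by summing `tokens_used` across results and passing the total to `is_break_even`. Surfacing ten concurrent result streams to a human in a usable way, the UX problem described above, is deliberately out of scope for this sketch.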

Bohdan 39:13

Yeah, I just want to add my two cents in real quick, as the proudly least technical person in the company, as a former infantry guy who recently got into cyber, and now I'm using these tools to help find zero days in pen test applications with, like, no formal training. We're also right now working on a potential partnership deploying this in a training pipeline, so to get, like, the infantry sergeant that just VTIP'd and had their first day. They're learning the A to B to C steps, and now, in the culminating exercise, they're using advanced AI tools that we built to help them actually conduct the operation. Because these people, they don't care about just getting a certification. They care about delivering an effect in an operation.

Maggie 39:52

Patrick, so you said the tech today is not superhuman. What will it take to get to being a fully autonomous penetration tester? Or, you know, where do you think the tech will be in five to 10 years?

Patrick 40:04

Yeah, I mean, I think on the commercial side, you know, some companies, some CISOs, want a fully autonomous tester. I don't think the market's actually there to support that. You know, it's there to an extent. I think the better place to go fully autonomous is a software development lifecycle integration. And I think the companies that are doing that on the defensive side and the commercial side, those are going to be the ones that succeed. I think that a lot of the structure of this industry depends on having a human in the loop, whether that's just communicating results to someone, or whether that's, you know, making sure you don't take down prod, right? There are definitely still issues. Like, with our tool the other day, we were doing a bug bounty, testing a SQL injection endpoint, and the test query that our agent used was DROP TABLE users, which is not what you want. This is in production. This is not what you want in production for, like, a test SQL injection query. And, you know, the counterargument to that is, oh, these models will get better, and then they can be fully autonomous. Like, they could be. But would we want them to be? I think the much better play, if you're trying to protect your company, is to continue to hire the same people. You're hiring the same firm, the same external pen testers, internal testers if you have them, but just support those people and supercharge those people. And I think it comes back to a lot of our vision as a company, which is that a lot of technology and a lot of the fifth domain really is fundamentally about people and information. And I think that often gets lost in this kind of, like, cyber, computer, whatever. But, you know, we are building a tool to support people, and I think that a lot of the incentive structure about why people are involved in this in the first place won't change with AI.
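Editor's sketch: the DROP TABLE anecdote suggests a simple human-in-the-loop gate in which any payload that looks destructive is held until a person approves it. The snippet below is only an illustration of that idea under stated assumptions. `needs_human_approval`, `send_payload`, and the keyword list are hypothetical, and keyword matching is a deliberately crude heuristic that would not be sufficient protection for a production target on its own.

```python
import re

# Assumed, non-exhaustive list of SQL keywords that can modify or destroy data.
DESTRUCTIVE = re.compile(
    r"\b(drop|truncate|delete|update|insert|alter)\b", re.IGNORECASE
)


def needs_human_approval(payload: str) -> bool:
    """Flag payloads containing a data-modifying SQL keyword."""
    return bool(DESTRUCTIVE.search(payload))


def send_payload(payload: str, send, approve) -> str:
    """Send a test payload, pausing for a human decision when it looks risky.

    send(payload) performs the request and returns its result.
    approve(payload) asks the operator and returns True or False.
    """
    if needs_human_approval(payload) and not approve(payload):
        return "blocked"
    return send(payload)
```

With a gate like this, a read-only probe such as `' OR '1'='1` goes straight through, while `'; DROP TABLE users; --` waits for the operator, which matches the "stop, steer and intervene" style of interaction the founders describe.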

Akhil 41:49

Thanks, Patrick. Speaking of the users, I want to come back to the user journey, how the users and their organizations are getting value a year from now and five years from now, what do you want users and the leadership of those organizations that are using Autonomous Cyber to say about how it worked, how it enabled their users or filled gaps in their organization?

Patrick 42:13

Yeah, great question, Akhil. I think the overarching answer for that is we want FUZZ-E to be plugged into every part of the workflow, so that it's the first tab they open Monday morning, whether they're a vulnerability researcher trying to find zero days, trying to string together CVEs for a new remote execution, or whether they're actually trying to throw those at a network during an operation. I think the broader, like, five-year vision is really also integrating a command and control aspect into it, because right now all these different systems and people are not talking. All they have right now is just custom solutions for a very small set of problems that honestly just gather dust, virtual dust, most of the time. So what we want to do is bridge that gap, and really it's a communication tool between the end user and the leadership that we want to have. And so the leadership, the mission owners, can plan the missions, can set tasks for different units, and then ultimately control the whole operation.

Akhil 43:07

Thanks. Bohdan, as you've experienced and certainly we know as well, adoption of new technology within the government takes time. It can be slow. You run into things like continuing resolutions. How do you actually get your ultimate users to believe this thing actually works, and then have the advocacy internally and the champions internally to help push it forward when it comes time for budgeting and everything else?

Bohdan 43:35

Yeah. Another great question, Akhil. You are 100% right: selling into the DoD is selling into a customer base that does not trust easily. And our customer in particular trusts outsiders even less than the average. And this is an industry where you have to be in person. You're not going to form any of that trust over Zoom or Teams. You have to be flying out to meet the customer and the end user as much as possible. You have to sit side by side with them as they actually do their job and understand what they go through. Just sitting in the conference room or a meeting, that's not going to cut it. When Patrick and I were first starting out, it was really about kind of leveraging other networks that we built in the service and in the national security space. We were taking meetings anywhere we could get them, in most cases off base, meeting with anyone that was willing to see a quick demo or to talk about major pain points and priorities, and from there, it was about aligning our product development to solve a top-three urgent need that we found out they had. So to bring it back to the building trust aspect, the first thing you have to do is be committed to the actual mission. It's not about building a specific product, but more about solving a problem that the DoD service member has. Once they start to feel like you both are on the same team and are working together to provide mission wins for the unit, that is when the actual great relationship can begin. The second part, I'd say, is actually about delivering. Service members, especially now, get pulled in, like, hundreds of different directions with different tasks. A lot of them at this point have heard many pitches and have seen many companies come and go trying to sell into their area. You want to be providing actual value from the first time a user gets their hands on your technology. You don't just want to talk about it the entire time, or at least, you want them to be able to see the potential of where you are headed.

I'll give an example. I remember when we first released what we would call our MVP to users, uh, last November, one of them said, and I quote, it felt magical when they got to see our technology in action. And the great part is he was a direct contributor to that state of the product. Just three months prior, when we had our first batch of DoD users try our platform for the first time, his direct feedback influenced the direction we took the product. Our first pre-MVP version was completely autonomous, and the first piece of feedback we received from him was that it had to be iterative: I need to be able to stop, steer and intervene at any point. And so when we delivered that next iteration with a bunch of modifications and included his feedback, that was not something I think he had ever seen before in his DoD career. And so to kind of tie it all together, it comes down to listening to the customer and what they actually need, plugging into their daily workflows, and delivering the capability you promised. If you do that, you're going to win over even the hardest of skeptics in your user base, and they'll come to bat for you when it matters. And ours already have, several times.

That's awesome. And I think a good broader point of emphasis around the use of any type of automated or AI-enabled solution is that it's human-centric. It can be a human in the loop of AI or AI in the loop of a human, but it's about ensuring that there is that ability for human influence and human shaping. Because ultimately, Patrick, to your point, we're talking about users, as you mentioned, and individuals whose workflow we're enabling. And that goes far beyond Autonomous Cyber, but it's certainly a core component of it.

Akhil 46:21

As we wrap up here, last couple of questions, guys. Biggest surprise building Autonomous Cyber? Anything you didn't see coming, or that you would tell yourselves as you were starting this journey?

Bohdan 47:05

So honestly, I think I've been surprised with how fun it's been. I know it's kind of a corny answer, but leaving the military, I always assumed that I was destined for another, like, bureaucratic position at a large company, and the more exposure I got to startups since leaving active duty, the more I was pulled towards this world where you just have full freedom to build a product and a team how you see fit. I mean, don't get me wrong, there are definitely long days and nights involved, and there are definitely plenty of stressful times, but I don't think I've actually ever felt like I was working. We have a great team who I love working with. Everyone is highly motivated on the mission and has full autonomy on how they want to tackle problems. And then there's the customer. I think the best part of my job is when I get to fly out and see them to showcase product progress, get them hands-on with the product and receive that direct feedback. I'm lucky to get to work again with some of America's finest and help contribute to the reason that the US and allies stay ahead of adversaries in offensive cyber.

Patrick 47:58

Yeah, I think that's a good answer. I think mine's kind of the same. I think it's the people who have stood up and said, like, we've heard some version of, quote-unquote, "I will do everything in my power to help you." I think that, you know, it's hard to do this. Bo, I think you've traveled, what, every week for the past 12 weeks? Last week, just for the purpose of this podcast, let's say it was a bit of a cluster with a specific thing that was happening. But, you know, I think you had two red-eyes, three red-eyes that week, and just a bunch of surprise travel. So I think it's hard to do, but a lot of the allies that we've picked up along the way, there's been a lot of resonance with this mission in terms of people who are in this space, and it's been a really good and nice surprise to see. Especially, again, coming from the national security community, you're often in windowless rooms and these small pockets, but to see how broad-based this call is for really being competent in this domain, I think that's been really cool to see, and that's been a really nice surprise about building this.

Akhil 48:54

That’s awesome, Patrick. Let me just turn internally to your own team. First question: Patrick, what's Bohdan’s superpower? And Bohdan, what's Patrick’s superpower?

Patrick 49:03

I was joking about this yesterday in an event we had. I was talking about Bohdan's last week. I think, you know, for vets who are looking to transition, it's probably the most directly applicable use of infantry skills I've seen in a job, where it's like, I mean, his job is basically just to suffer in a lot of ways. I mean, he is on the road, he's traveling, and I see Bohdan about as much as I see my spouse. Like, honestly, you just spend a lot of time with your co-founder. So I think it's been really nice to just see someone absolutely grind relentlessly, positively, and not freak out. And, like, I think soft skills are often overrated in a lot of ways, honestly, but Bo is just the perfect person for this, and it's just been really incredible to work with him on this. So that's the part I would say for that.

Akhil 49:59

Yeah, thanks. Thanks, Patrick. You know, usually us prior infantry officers have a tough time finding use for our intangible skills, Bohdan.

Bohdan 50:08

Glad to know those red-eyes are being put to use, and hopefully something good will come out of it. But I'd say for Patrick, his superpower really is kind of, like, the vision. I mean, when we first met at business school, he was talking about this crazy idea he had. And the more and more I talked with him, I'm like, okay, he's not thinking about next year. He's thinking about the next 5 to 10 years. And, you know, one of the taglines that he likes to use is, like, we're building for the coming decade of cyber operations. He will see just a napkin architecture of a workflow, and he'll be like, that's exactly what operations are going to be like for the next 10 years. And so I think for us, especially on the engineering side, the vision that Patrick has when building this company really is not building for the short term. It's about building a sustainable advantage that we will have in the future, not just with our tech, but with how we approach the strategy, like which customer base we should go after, and so on and so forth. So again, I'm really proud to work with him. He's been an awesome, you know, not just co-founder, but also a great friend, totally.

Akhil 51:10

Bohdan, Patrick, who else you looking to bring on the team?

Patrick 51:14

I mean, I think in general, we have soft commits from our next five hires right now. I'm, like, a huge believer, and Walter Isaacson is another one of my favorite authors. I think there are a lot of things to not idolize about Steve Jobs, but one thing I definitely took from the Steve Jobs biography is the team that built the Macintosh, just this small team of absolute A players. And I think that's our company culture. I mean, everyone says, oh, we do best of the best. I mean it: we only do best of the best, and we have to, based on how we operate and how we work. Don't find us. We find you. In all instances, that's largely been true so far. So I think in general, it's folks who care about this domain deeply, who just really, really care about this mission, and who have a lot of competence. In general, those hires skew a little older. You know, our youngest hire, he's like our Gen Z person on our team, he does a really good job, and he's fantastic. But I think in general, we're looking for people who really care about this, who have experience in this domain and are opinionated. You know, some of our friends, they'll put out job solicitations saying they look for low-ego people. And that's not true at all for us. We definitely look for people with a little bit of ego. Like, me and our founding security engineer were definitely getting into it yesterday on our product architecture, and that's what we want. I want someone who wants to tell me, nah, man, that's bunk. And again, I'd probably use another word than bunk if this weren't going out on a public podcast. And the same thing is true, I think, for one of the roles we don't have a soft commit for, that we're going to be hiring for soon, which is someone to really build in a lot of the AI feature sets, to kind of improve some of our fundamental AI stuff. And we're gonna want someone for that role that comes in and tells me what we should be doing, right? So I think in general, people in this domain, people who care a lot, and people who do have a bit of an ego, like people who, you know, are here for a reason and have an agenda to push. So I think that's who we look for.

Maggie 53:18

Patrick and Bo, what are your guys' most controversial takes about cybersecurity and AI and cyber operations.

Patrick 53:39

Maggie, you're gonna get me in trouble. So, I think you can win a war in the fifth domain. That's my most controversial take. I think that the way people talk about this isn't right. I'll stop there, but I think you can win a war in the fifth domain.

What advice do you guys have for other startups building for the national security community?

I think that, for me, and Bo, you'd probably give the same answer, I definitely had some understanding. You know, I was in for a while, went to business school, was thinking about the space for a long time. I've known you all at Shield for years and, you know, had many of these conversations. I think I knew the government customer was going to be a difficult customer to sell to and, like, to connect the dots with. I think I didn't understand how hard that is, even when you have insane enthusiasm from the end user. We've had a couple of instances so far where we have end users and a shop and, like, command up to a certain point that absolutely love what we're doing, and it's still been hard to really connect the dots on the sale. And I kind of thought, oh, you just have, like, a Bohdan, and then Bohdan takes care of that, and it's fine. Which, in practice, is how it works. But Bohdan taking care of that has meant a lot, a lot of work on his part. So I think my advice for startups building for national security is, like, I think it is even harder than people say it is. I won't name the investor, but there was a specific part of a tier one investor that worked in defense tech. They were literally the defense tech investor for a tier one. And when we pitched them, this is, you know, I think a year ago now, they told us no, because they didn't like CYBERCOM as a customer. And, like, that's the defense tech component of a tier one. So I think the national security space is much, much harder even than you think it is, is what I'd say, even to people who are insiders. That's my answer.

Bohdan 55:38

Yeah, 100%, Patrick, I agree. The one thing I do want to add on, though, is that the market is different, but the fundamentals remain the same. Build product, talk to users. Everything else is secondary. And when you're talking to users, make sure that you're actually plugging into the workflows and observing how they actually do their jobs. Again, not just in conference rooms for meetings with, like, 30 people. Also, as soon as you're able to afford one, hire a government relations firm. Innovation funds are all well and good at the pre-seed, maybe even the seed stage, but you have to start early in order to align yourself with the POM and the FYDP cycles. The DoD budget operates on three-to-five-year time frames. And if you want to capture significant line items in the 2030 NDAA, you have to start getting in front of staffers on the Hill today.

Maggie 56:21

Great. Well, thank you guys so much.

Akhil 56:23

Yeah, it was awesome to be here. Thank you. Thanks so much. This was fun. Thanks guys, bye.
