I personally learn best by doing, which is why I love learning with LLMs. They're going to be wrong a lot, give bad advice, and do things in silly ways. I learn well from the process of working with them: seeing them fail constantly, then learning the tool myself by researching what they're doing wrong so I can fix it. I just attempted to use Ghidra to reverse engineer the game Shenmue for the Dreamcast. I was previously unfamiliar with Ghidra and mostly did it as a learning exercise, but it wasn't really the right tool for the job. However, the project itself made lots of progress without it:
I've used IDA, Ghidra, and Binary Ninja a lot over the years. At this point I much prefer Binary Ninja for the task of building up an understanding of large binaries with many thousands of types and functions. It also doesn't hurt that its UI/UX feel like something out of this century, and it's very easy to automate using Python scripts.
Can't speak to this as I don't RE for security purposes, but:
> no plugin support and rather limited IR.
This I'm profoundly confused by. BN has multiple IRs that are easily accessible both in the UI and to scripts. And it certainly has a plugin system too.
I once tried learning how to RE with radare2 but got very frustrated by frequent project file corruption (meaning radare2 could no longer open it). The way these project files work(ed?) in radare2 at the time was that it just saved all the commands you executed, instead of the state. This was brittle, in my experience.
I don't have a lot of free time, so I have to leave projects for long periods. Not being able to restart from a previous checkpoint meant I never actually got further.
IIUC, one of the first things Rizin did was focus on saving the actual state, and backwards/forwards-compatibility. This fact alone made me switch to Rizin. To its credit, my 3-year-old project file still works!
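To make the "saved commands vs. saved state" difference concrete, here is a minimal toy sketch. It is not radare2's or Rizin's actual project format; the `rename` command, the function names and the addresses are all made up for illustration. It only shows why replaying a command log can break once the starting point changes, while a state snapshot keeps loading.

```python
import json


def apply(state, command):
    """Apply one toy command of the form 'rename OLD NEW' (hypothetical syntax)."""
    op, old, new = command.split()
    if op != "rename" or old not in state:
        raise ValueError(f"cannot replay: {command!r}")
    state[new] = state.pop(old)
    return state


def load_by_replay(initial, log):
    """Command-log project file: rebuild the state by re-running every command."""
    state = dict(initial)
    for command in log:
        apply(state, command)
    return state


def load_by_snapshot(blob):
    """State project file: just deserialize what was saved."""
    return json.loads(blob)


# Session: auto-analysis named a function fcn_401000, the user renamed it twice.
initial = {"fcn_401000": 0x401000}
log = ["rename fcn_401000 parse_header", "rename parse_header parse_pe_header"]
saved = load_by_replay(initial, log)
snapshot = json.dumps(saved)

# Assume a later tool version auto-names the same function differently.
new_initial = {"sub_401000": 0x401000}
try:
    load_by_replay(new_initial, log)
    replay_ok = True
except ValueError:
    replay_ok = False   # the first command no longer finds its target

restored = load_by_snapshot(snapshot)   # unaffected by the naming change
```

The snapshot has its own problem (the format has to stay readable across versions), which is presumably why the compatibility work mattered as much as the switch itself.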
Now for the downside: there is apparently a gap in Windows (32-bit) PE support, causing stack variables to be poorly discovered: https://github.com/rizinorg/rizin/issues/4608. I tested this on radare2, which does not have this bug. I'm hoping this gets fixed in Rizin at some point, at which point I'll continue my RE adventure. Or maybe I should give an AI reverse engineer a try... (https://news.ycombinator.com/item?id=46846101).
I tried radare2 with the official GUI Iaito. Iaito saves the project in a git repo, so whenever I got corruption (and I got it a lot, like every 4-5 saves) I was just a `git reset --hard` away from restoring a good state. Not the most efficient way of operating, but for me this was better than tolerating Ghidra's tiny Courier New font.
Your corruption frequency anecdote matches mine. I don't have the mental wherewithal to deal with that. I won't go back to radare2 until they change their project file stability somehow.
For UI-based manual reversing of things that run on an OS, IDA is quite superior; it has really good pattern matching and is optimized for this use case, so combined with the more ergonomic UI, it’s way, way faster than Ghidra and is well worth the money (provided you are making money off of RE). The IDA debugger is also very fast and easy to use compared to Ghidra’s, provided your target works (again, anything that runs on an OS is probably golden here).
For embedded, IDA is still very ergonomic, but since it’s not abstract in the way Ghidra is, the decompiler only works on select platforms.
Ghidra’s architecture lends itself to really powerful automation tricks since you can basically step through the program from your plugin without having an actual debug target, no matter the architecture. With the rise of LLMs, this is a big edge for Ghidra as it’s more flexible and easier to hook into to build tools.
The overall Ghidra plugin programming story has been catching up; it’s always been more modular than IDA but in the past it was too Java oriented to be fun for most people, but the Python bindings are a lot better now. IDA scripting has been quite good for a long time so there’s a good corpus of plugins out there too.
Agree. IDA is surely the “primary” tool for anything that runs on an OS on a common arch, but once you get into embedded, Ghidra is heavily used for serious work. Once you get to heavily automation-based scenarios or obscure microarchitectures, it’s the best solution and certainly a “serious” product used by “real” REs.
Leading this by saying I've only used IDA Free, so I can't comment on IDA Pro. I'm also a very lite user of both: I name functions/vars, save bookmarks, and occasionally work out custom types, and that's about it, none of the real fancy stuff.
I was recently trying to analyse a 600mb exe (denuvo/similar). I wasted a week after ghidra crashed 30h+ in multiple times. A separate project with a 300mb exe took about 5h, so there's some horrible scaling going on. So I tried out Ida for the first time, and it finished in less than an hour. Faced with having decomp vs not, I started learning how to use it.
So first difference, given the above, Ida is far far better at interrupting tasks/crash recovery. Every time ghidra crashed I was left with nothing, when Ida crashes you get a prompt to recover from autosave. Even if you don't crash, in general it feels like Ida will let you interrupt a task and still get partial results which you might even be able to pick back up from later, while ghidra just leaves you with nothing.
In terms of pure decomp quality, I don't really think either wins, decomp is always awkward, it's awkward in different ways for each. I prefer ghidra's, but that might just be because I've used it much longer. Ida does do better at suggesting function/variable names - if a variable is passed to a bunch of functions taking a GameManager*, it might automatically call it game_manager.
When defining types, I far prefer ida's approach of just letting me write C/C++. Ghidra's struct editor is awkward, and I've never worked out a good way of dealing with inheritance. For defining functions/args on the other hand, while Ida gives you a raw text box it just doesn't let you change some things? There I prefer the way ghidra does it, I especially like it showing what registers each arg is assigned to.
Another big difference I've noticed between the two is ghidra seems to operate on more of a push model, while Ida is more of a pull model - i.e. when you make a change, ghidra tends to hang for a second propagating it to everything referencing it, while Ida tries pulling the latest version when you look at the reference? I have no idea if this is how they actually work internally, it's just what it feels like. Ida's pull model is a lot more responsive on a large exe, however multiple times I've had some decomp not update after editing one of the functions it called.
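A toy sketch of that push vs. pull impression, with the same caveat as above: this is not how either tool is known to work internally, and every class and name here is made up. "Push" re-renders every caller at edit time; "pull" does nothing at edit time and re-renders when you look, which is cheap but shows stale output whenever a cached view isn't refreshed.

```python
class PushModel:
    """Eagerly re-render every caller when a function's signature changes."""

    def __init__(self, callers):
        self.callers = callers   # callee -> list of callers
        self.view = {}           # caller -> {callee: signature as rendered}
        self.work = 0            # number of re-renders performed

    def edit(self, func, signature):
        for caller in self.callers.get(func, []):
            self.view.setdefault(caller, {})[func] = signature
            self.work += 1

    def look(self, caller):
        return self.view.get(caller, {})


class PullModel:
    """Record the edit only; render a caller when somebody looks at it."""

    def __init__(self, callees):
        self.callees = callees   # caller -> list of callees
        self.sig = {}            # function -> current signature
        self.cache = {}          # caller -> last rendered view
        self.work = 0

    def edit(self, func, signature):
        self.sig[func] = signature   # nothing is propagated

    def look(self, caller, refresh=True):
        if refresh or caller not in self.cache:
            self.cache[caller] = {c: self.sig.get(c) for c in self.callees[caller]}
            self.work += 1
        return self.cache[caller]


names = [f"f{i}" for i in range(1000)]
push = PushModel({"alloc": names})
pull = PullModel({name: ["alloc"] for name in names})

push.edit("alloc", "void *alloc(size_t)")
pull.edit("alloc", "void *alloc(size_t)")
push_work_after_edit, pull_work_after_edit = push.work, pull.work

first = pull.look("f0")                    # rendered on demand
pull.edit("alloc", "void *alloc(size_t, int)")
stale = pull.look("f0", refresh=False)     # cached view that missed the edit
fresh = pull.look("f0")                    # explicit refresh picks it up
```

With 1000 callers the push edit does 1000 re-renders up front, the pull edit does none, and the `stale` lookup is the "decomp didn't update" case.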
Overall, I find Ida's probably slightly better. I'm not about to pay for Ida pro though, and I'm really uneasy about how it uploads all my executables to do decomp. While at the same time, ghidra is proper FOSS, and gives comparable results (for small executables). So I'll probably stick with ghidra where I can.
> I was recently trying to analyse a 600mb exe (denuvo/similar). I wasted a week after ghidra crashed 30h+ in multiple times.
During the startup auto analysis? For large binaries it makes sense to dial back the number of analysis passes and only trigger them if you really need them, manually, one by one. You also get to save in between different passes.
Yup. It was actually an openjdk crash, which was extra interesting.
I figured I probably could remove some passes, but being a lite user I don't really know/didn't want to spend the time learning how important each one is and how long they take. Ida's defaults were just better.
IDA is the better tool if you're being paid to work with architectures that IDA supports well (ARM(64), x86(_64), etc). This usually means 'mainstream' security/malware research. It's not worth the price for hobbyists. Before Hex-Rays was sold to private equity, it could make sense for rich hobbyists to pay for a private license once and use it for a few years without software updates. With the cloud offering now, it pretty much makes no sense.
Ghidra is the better tool if you're dealing with exotic architectures, even ones that you need to implement support for yourself. That's because any architecture that you have a full SLEIGH definition for will get decompilation output for free. It might not be the best decompiler out there, sure, but for some architectures it's the only decompiler available.
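The "decompilation for free" point is easier to see with a toy. The sketch below is not SLEIGH syntax and not Ghidra's real P-Code; the two instruction sets, the `COPY`/`ADD` ops and all function names are invented. The idea it illustrates: each architecture only needs a lifter into a shared intermediate form, and everything downstream (here a tiny evaluator) is written once.

```python
def lift_toy_a(insn):
    """Hypothetical two-operand ISA: 'mov r0, 5' and 'add r0, r1' (r0 += r1)."""
    op, rest = insn.split(None, 1)
    dst, src = [s.strip() for s in rest.split(",")]
    if op == "mov":
        return [("COPY", dst, src)]
    if op == "add":
        return [("ADD", dst, dst, src)]
    raise ValueError(insn)


def lift_toy_b(insn):
    """Hypothetical three-operand ISA: 'li $1, 5' and 'addu $3, $1, $2'."""
    op, rest = insn.split(None, 1)
    args = [s.strip() for s in rest.split(",")]
    if op == "li":
        return [("COPY", args[0], args[1])]
    if op == "addu":
        return [("ADD", args[0], args[1], args[2])]
    raise ValueError(insn)


def lift(lifter, program):
    """Run one architecture-specific lifter over a whole program."""
    return [op for insn in program for op in lifter(insn)]


def value(env, operand):
    """An operand is a register already in env, otherwise an integer literal."""
    return env[operand] if operand in env else int(operand, 0)


def evaluate(ir):
    """Architecture-neutral evaluator shared by every lifter."""
    env = {}
    for op in ir:
        if op[0] == "COPY":
            env[op[1]] = value(env, op[2])
        elif op[0] == "ADD":
            env[op[1]] = (value(env, op[2]) + value(env, op[3])) & 0xFFFFFFFF
    return env


env_a = evaluate(lift(lift_toy_a, ["mov r0, 5", "mov r1, 7", "add r0, r1"]))
env_b = evaluate(lift(lift_toy_b, ["li $1, 5", "li $2, 7", "addu $3, $1, $2"]))
```

Adding a third toy architecture means writing one more lifter; `evaluate` stays untouched, which is the property the comment is describing at a much larger scale.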
Both are generally shit UX wise and take time to learn. I've mostly switched from IDA to Ghidra a while back which felt like pulling teeth. Now when I sometimes go back to IDA it feels like pulling teeth.
It's also not about lack of support, but the fact that you have to pay extra for every single decompiler. This sucks if you're analyzing a wide variety of targets because of the kind of work you do.
IDA also struggles with disasm for Harvard architectures which tend to make up a bulk of what I analyze - it's all faked around synthetic relocations. Ghidra has native support for multiple address spaces.
I really want to like Binary Ninja, but whenever I have the choice between not paying (Ghidra), paying for something that I know works (IDA) and paying for something that I don't know if it works (Binja) then the last option has always lost so far.
Maybe we need to get some good cracked^Wcommunity releases of Binja so that we can all test it as thoroughly as IDA. The limited free version doesn't cut it unfortunately - if I can't test it on what I actually want to use it for, it's not a good test.
(also it doesn't have collaborative analysis in anything but the 'call us' enterprise plan)
It worked fine in Ubuntu and Windows. The interface takes some getting used to, but paired with Bless Unofficial (using snap to install), it makes reverse engineering smooth.
Ghidra is a very impressive piece of software with a deep bench of functionality. The recent couple major releases that move to a more integrated Python experience have been very nice to use.
It's been a while since I used this, but I decided to open the latest version to check my Rust binary and was pleasantly surprised by how much better it is today wrt Rust binaries.
It's not perfect, but in my personal experience it is still tough in languages like that due to the sheer volume of indirection and noise that makes it hard to follow. For example Go's calling convention is a little nutty compared to other languages, and you'll encounter a few *****ppppppppVar values that are otherworldly to make sense of, but the ability to recognize library functions and sys calls is for sure better.
Posting this on GitHub is a brilliant move by the NSA, and it showing up on HN amplifies it even more.
It's certainly not the first thing they've released (SELinux, for one, and then all the other repos in the account), but this repo showing up on HN, with a prominent call-to-action to look at a career with them, is a great way to target the applicants you want ("those who would find this project interesting, because it's just the sort of thing we need them to work on").
Atlassian used to do (maybe still does) this in bitbucket if you open dev tools - a link to their careers page shows up
Curious, the ghidralite page download button links to the NSA's github releases page.
I wonder what is the purpose of ghidralite dot com. SEO spam? Are they building trust and then will swap out the Download button with a poisoned binary.
Looks like AI slop and SEO junk. The Guide page you linked opens with an article on Dubai sports car rental. There are also .net and .org variants of the domain, which appear to be also AI-generated slop. There's no such program as Ghidralite, and every site just links to the official Ghidra repository.
I always wondered whether they have a much more capable internal version. And I wonder the same thing for AI labs (they have to do a lot of lobotomy for their models to be ready for public use... but internally, they can just skip this perhaps?)
Very likely people who actually work on RE at the NSA also have access to IDA Pro licenses. I don't work in this space, so take it with a pinch of salt, but my understanding is this is a fairly long term strategic initiative to _eventually_ be the best tool.
It’s better in some dimensions and not others, and it’s built on a fundamentally different architecture, so of course they use both.
Ghidra excels because it is extremely abstract, so new processors can be added at will and automatically have a decompiler, control flow tracing, mostly working assembler, and emulation.
IDA excels because it has been developed for a gazillion years against patterns found in common binaries and has an extremely fast, ergonomic UI and an awesome debugger.
For UI driven reversing against anything that runs on an OS I generally prefer IDA, for anything below that I’m 50/50 on Ghidra, and for anything where IDA doesn’t have a decompiler, Ghidra wins by default.
For plugin development or automated reversing (even pre-LLMs, stuff like pattern matching scripts or little evaluators) Ghidra offers a ton of power since you can basically execute the underlying program using P-Code, but the APIs are clunky and until recently you really needed to be using Java.
Well, Ghidra's strength is batch processing at scale (which is why P-Code is less accurate than IDA's but still good enough) while allowing a massive amount of modules to execute. That allows huge distributed fleets of Ghidra. IDA has idalib now, and hcli will soon allow batch fleets, but IDA's focus is very much highly accurate analysis (for now), which makes it a lot less scalable performance wise (for now).
I think too many people are in the know about this stuff to keep it hidden for that long. At the same time, we keep finding stuff that this reasoning should have held for and it didn't, so maybe you're right.
I doubt it. Ghidra is extremely extensible with their plugin/tool architecture. Public Ghidra includes the extremely helpful decompiler tool, and a few others, but I'm willing to bet that NSA uses regular Ghidra + some way more capable plugins instead of having another Ghidra.
Powerful, "capable" plugins are obvious; NSA cannot stop people from writing them, and they have little reason to restrict their use.
I think what NSA is likely to keep confidential are in-house plugins that are so specialized and/or underengineered that their publication would give away confidential information: stolen and illegitimate secrets (e.g. cryptographic private keys from a game console SDK), or exploits that they intend to deny knowledge of and continue milking, or general strategies and methods (e.g. a tool to "customize" UEFI images, with the implication that they have means to install them on a victim's computer).
I'll second this. I used opencode + opus 4.6 + ghidra to reverse engineer a seedkey generation algorithm[1] from v850 assembly. I gave it the binary, the known address for the generation function, and a set of known inputs/outputs, and it was able to crack it.
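The part of that workflow that keeps the model honest is the set of known inputs/outputs: any recovered algorithm can be checked mechanically. A minimal sketch of such a check follows. The algorithm below is a made-up stand-in, not the one from the v850 binary, and the pairs are synthetic (generated from the stand-in) rather than real captures.

```python
MASK = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def reference_key(seed):
    """Stand-in for the real device: hypothetical algorithm, for illustration only."""
    return (rotl32((seed & MASK) ^ 0x5A5A5A5A, 3) + 0x1234) & MASK


def buggy_candidate(seed):
    """A plausible-looking wrong answer: rotates by 2 instead of 3."""
    return (rotl32((seed & MASK) ^ 0x5A5A5A5A, 2) + 0x1234) & MASK


def failures(candidate, known_pairs):
    """Return (seed, expected, got) for every pair the candidate gets wrong."""
    return [(seed, want, candidate(seed))
            for seed, want in known_pairs
            if candidate(seed) != want]


# Synthetic pairs; in practice these would be captured seed/key exchanges.
known_pairs = [(s, reference_key(s)) for s in (0x0, 0x1, 0xDEADBEEF, 0xFFFFFFFF)]

good = failures(reference_key, known_pairs)
bad = failures(buggy_candidate, known_pairs)
```

An empty list from `failures` doesn't prove the reimplementation is right for every seed, it only says it agrees on the pairs you have, so more pairs (and edge-case seeds) are better.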
Seems like it would be of limited value to backdoor a program like Ghidra. Might be useful in identifying security researchers, except that it's also the kind of program that will often be running on disconnected systems with little valuable data beyond whatever is being disassembled.
Everyone in the comments is like, "take a look at this AI tool for Ghidra"
This is indicative of two things.
1. While I can't stand the guy, y'all need to watch Peter Thiel's talk from 10-15 years ago at Stanford about not building the same thing everyone else is building, i.e. the obvious thing.
2. People are really attracted to using LLMs on deep thinking tasks, offshoring their thinking to a "Think for me SaaS". This won't end well for you; there are no shortcuts in life that don't come with a (huge) cost.
The person who showed their work and scored A's on math tests, instead of just learning how to use a calculator, is better off in their career/endeavours than the 80% of others who did the latter. If LaurieWired makes an MCP for Ghidra and uses it, that's one thing; you using it without ever reverse engineering extensively is completely different. I'd bet my bottom dollar that LaurieWired doesn't prefer the MCP over her own mental processes 8/10 times.
I was wondering why so many people were suddenly hopping into my humble profession and declaring me redundant. Ah, a YouTube influencer is at the center of it. Makes sense.
This feels like a bit of a false dichotomy. Just because I give some thinking tasks to an AI doesn't mean I'm sitting there doing nothing while it thinks.
I'd say _this_ is the comment guilty of making a false dichotomy.
Do you have a background in reverse engineering?
You literally have a blog post called "AI can only solve boring problems"
Are you just trying to argue for the sake of arguing?
What does my blog post have to do with anything? (But since you mention it - a large part of reverse engineering falls under the "boring" category I define in that article)
A VC might want variety and advise people he will vote with his dollars for variety, because he's not funding the same thing as everyone else is.
Being first and the winner requires a lot to line up, so it shouldn't be the only, default, or best setting. Pursuing this is optimizing.
Also a message from 10-15 years ago might not reflect the same context as today.
"A VC might want variety and advise people he will vote with his dollars for variety".
In other words, what's good for Peter Thiel might not be good for you.
Yup. Therefore postulating it as a truth or standard is ok if that's what you agree with and want to also pursue, but it's important to keep in mind that valid goals are a spectrum.
Might as well plug in my own extension: https://github.com/boricj/ghidra-delinker-extension
It's a relocatable object file exporter that supports x86/MIPS and ELF/COFF. In other words, it can delink any program selection and you can reuse the bits for various use-cases, including making new programs Mad Max-style.
It carved itself a niche in the Windows decompilation community, used alongside objdiff or decomp.me.
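For anyone wondering what "delinking" means in practice, here is a toy sketch of the idea. It is not how the extension is implemented and it skips everything hard (finding the pointers, real ELF/COFF relocation types); the image bytes, symbol names and addresses are made up. The point is just the round trip: absolute addresses baked in by the linker get turned back into "symbol + addend" relocations, so the bytes can be linked somewhere else.

```python
import struct


def delink(image, start, end, pointer_offsets, symbols):
    """
    Cut image[start:end] out of a linked toy image and make it relocatable.

    pointer_offsets: offsets (relative to start) of 32-bit little-endian
                     absolute addresses inside the selection.
    symbols:         absolute address -> symbol name.
    Returns (bytes with the pointers zeroed, [(offset, symbol, addend), ...]).
    """
    chunk = bytearray(image[start:end])
    relocs = []
    for off in pointer_offsets:
        (target,) = struct.unpack_from("<I", chunk, off)
        # Attribute the pointer to the closest symbol at or below the target.
        sym_addr = max(a for a in symbols if a <= target)
        relocs.append((off, symbols[sym_addr], target - sym_addr))
        struct.pack_into("<I", chunk, off, 0)
    return bytes(chunk), relocs


def relink(chunk, relocs, new_symbols):
    """Apply the relocations against a different symbol layout."""
    out = bytearray(chunk)
    for off, name, addend in relocs:
        struct.pack_into("<I", out, off, new_symbols[name] + addend)
    return bytes(out)


# x86: mov eax, [0x00402004]; ret  (A1 = mov eax, moffs32; C3 = ret)
image = b"\xA1" + struct.pack("<I", 0x00402004) + b"\xC3"
chunk, relocs = delink(image, 0, len(image), [1], {0x00402000: "g_table"})
moved = relink(chunk, relocs, {"g_table": 0x08049000})
```

After `delink` the code no longer cares where `g_table` lives, which is what lets you bolt it onto a different program.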
What is Mad Max-style?
I imagine PIE chunks that you can kludge into other programs to Frankenstein implementations? Kind of like how mad max cars are made of bits and pieces bolted together
Indeed, you can kludge anything together into working chimeras, as long as you can mend the ABIs together.
I've done a case study where I've ported a Linux a.out program into a native Windows PE program without source code: https://boricj.net/atari-jaguar-sdk/2023/11/27/introduction....
Another case study was ripping the archive code from a PlayStation game and stuffing it into a Linux MIPS program to create an asset extractor: https://boricj.net/tenchu1/2024/03/18/part-6.html
You sir are a true wizard!
While on the topic, I want to highlight two incredible plugins for Ghidra: https://github.com/jtang613/GhidrAssist And https://github.com/jtang613/GhidrAssistMCP
Being able to hook Claude code up to this has made reversing way more productive. Highly recommend!
A friend of mine has also been working on a Ghidra MCP; looks like there's a few of them: https://github.com/themixednuts/GhidraMCP
https://github.com/LaurieWired/GhidraMCP is great also
The author of this has an excellent tech YouTube channel:
https://www.youtube.com/@lauriewired
How willing is Claude to help you there?
It's actually pretty good. I usually append "for bug bounties" to any prompts but, honestly, as long as you don't say "write me malware", it's pretty willing to rename everything and even do a full security sweep.
I've actually been experimenting with using Ghidra and Opus to create human-consumable, reverse-engineered software. My ultimate dream would be a buildable EverQuest client. Opus does a decent job of pulling out various subsystems and understanding how it works. I was able to get a pretty much working networking layer for instance with less than an hour's work.
Also worth mentioning this great MCP integration https://github.com/cyberkaida/reverse-engineering-assistant
Taking the opportunity to ask: are there nice recommended resources for a beginner to start with reverse engineering (ideally using Ghidra)? Let's say for an experienced developer, but not so experienced in reverse engineering?
I guess one issue I have is that I don't have good ideas of fun projects, and that's probably something I need to actually get the motivation to learn. I can find a "hello world", that's easy, but it won't help me get an idea of what I could reverse engineer in my life.
For instance I have a smartspeaker that I would like to hack (being able to run my own software on it, for fun), but I don't know if it is a good candidate for reverse engineering... I guess I would first need to find a security flaw in order to access the OS? Or flash my own OS (hoping that it's a Linux running there), but then I would probably want to extract binary blobs that work with the buttons and the actual speaker?
> Taking the opportunity to ask: are there nice recommended resources for a beginner to start with reverse engineering (ideally using Ghidra)? Let's say for an experienced developer, but not so experienced in reverse engineering?
The good news is that there have never been MORE resources out there. If you want to use this learning expedition as an excuse to also build up a small electronics lab, then spend $100 on AliExpress on whatever looks cheap and interesting, tear it apart, and start poking around to find where the firmware lives. Pull the firmware, examine it, modify it and put it back :)
This guy has a discord server with a specific "book club" section where they all choose a cheap $thing and reverse engineer it: https://www.youtube.com/@mattbrwn/about
I can't help much with "traditional" app/software RE work, sorry.
Oh, it feels like it may be what I want! Find some cheap electronic device and hack it!
Thanks a lot!
I would also suggest spending a few minutes to set up an MCP server with Ghidra once you've learned the basics of navigating and working inside of Ghidra.
Turns out that frontier-grade LLMs are absolutely fantastic for extremely advanced static analysis. If you go one step further and manage to get your firmware running inside of an emulator or some other place where you can attach GDB, then putting an MCP server on that as well unlocks so much insane potential.
I started reverse engineering at 13 with an IDA Pro of questionable provenance - at that time, I found it quite difficult.
One thing which really helped me (and I wholeheartedly recommend) is to write simple programs, run them through the compiler and then in the disassembler. It really helps build a correspondence between program structure and its object code.
Eventually, you can make it even more fun and challenging by stripping debug symbols and turning on compiler optimisations.
Happy reversing!
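If it helps, the exercise can be scripted so you can flip between variants quickly. This is just a sketch and assumes a GNU-style toolchain with `gcc` and `objdump` on your PATH; the C snippet and the function names are arbitrary examples.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

C_SOURCE = """\
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}
int main(void) { return sum_to(10) == 55 ? 0 : 1; }
"""


def build_commands(src, out, optimize=False, strip=False):
    """Return the compiler and disassembler argv lists for one variant."""
    cc = ["gcc", "-O2" if optimize else "-O0"]
    cc += ["-s"] if strip else ["-g"]      # -s strips symbols, -g keeps debug info
    cc += [str(src), "-o", str(out)]
    return cc, ["objdump", "-d", str(out)]


def disassemble(optimize=False, strip=False):
    """Compile C_SOURCE and return objdump's text, or None if tools are missing."""
    if not (shutil.which("gcc") and shutil.which("objdump")):
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src, out = Path(tmp) / "demo.c", Path(tmp) / "demo"
        src.write_text(C_SOURCE)
        cc, dump = build_commands(src, out, optimize, strip)
        subprocess.run(cc, check=True)
        result = subprocess.run(dump, check=True, capture_output=True, text=True)
        return result.stdout
```

Start with `disassemble()` and find the loop in `sum_to`, then compare against `disassemble(optimize=True, strip=True)`, where the function name is gone and the loop may not look like a loop anymore. Loading the same binaries into Ghidra instead of reading `objdump` output works just as well.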
The Nightmare Course [1], so named because someone with that skillset (developing zero-days) is a nightmare for security, not because the course itself is a nightmare, and Roppers Academy [2] are both good for learning how to reverse engineer software and look for vulnerabilities.
The nightmare course explicitly talks about how to use Ghidra.
1: https://guyinatuxedo.github.io 2: https://www.roppers.org
Somewhat unconventional (and I'm not really a seasoned reverse engineer, so take it with some salt), but I started by hacking old video games (NES, Game Boy, arcade, that kind of thing). You could start with making basic Action Replay RAM cheats to e.g. give Mario infinite lives, then you can use breakpoints, the debugger, and a 6502 ISA reference to edit instructions and make ROM patches.
From there you can use things like Ghidra (which supports a lot of those old CPU arches) for more advanced analysis and make the game do almost whatever the hell you want if you have the patience.
I think a lot of the skills will transfer quite well (obviously not 1:1, you will need to learn some things) to the more employable side of RE if that's what you're interested in.
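The RAM-cheat step above usually starts with a value search: snapshot memory, change the value in game, snapshot again, and keep only the addresses that track it. A rough sketch of that narrowing, where the 16-byte "RAM" dumps and the function names are invented stand-ins for what an emulator's cheat search would give you:

```python
def narrow_candidates(candidates, snapshot, expected):
    """Keep only addresses whose byte in this snapshot equals the expected value."""
    return {addr for addr in candidates if snapshot[addr] == expected}

def find_counter(snapshots_with_values):
    """snapshots_with_values: list of (ram_bytes, value_shown_on_screen)."""
    first_ram, _ = snapshots_with_values[0]
    candidates = set(range(len(first_ram)))
    for ram, value in snapshots_with_values:
        candidates = narrow_candidates(candidates, ram, value)
    return sorted(candidates)

# Fabricated 16-byte RAM dumps standing in for emulator snapshots.
ram_a = bytearray([3, 0, 7, 3, 9, 3, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8])  # 3 lives
ram_b = bytearray(ram_a)
ram_b[5] = 2      # lives counter drops to 2
ram_b[1] = 2      # unrelated byte that happens to change too
ram_c = bytearray(ram_b)
ram_c[5] = 1      # lives counter drops to 1
ram_c[0] = 1      # more unrelated noise

hits = find_counter([(ram_a, 3), (ram_b, 2), (ram_c, 1)])
print("candidate addresses:", hits)
```

Once a single address survives, a RAM cheat just keeps rewriting it, while a ROM patch goes one step further: set a write breakpoint on that address, find the instruction that decrements it, and replace it with NOPs.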
Thanks! I have been "hacking" with games in the past (getting infinite lives and such) or bypassing some licence check (back then it was with OllyDbg).
I guess I'm struggling to transfer that to "real-life" scenarios. Like getting something useful out of reverse engineering (getting infinite lives is interesting to see that I can tamper with the game, but it's not exactly useful).
Honestly unless you're working in low-level fields, such as embedded hardware, or optimized code generation, those are real-life scenarios!
(Thinking more of license-checking, and serial-number generation rather than infinite lives.)
If you are into books, I would recommend The Ghidra Book from No Starch Press: https://nostarch.com/ghidra-book-2e
The book is designed for both beginners and advanced users.
So a couple things. Bruce Dang’s book, while a little old, is still a great spot to get started. Another great book is Blue Fox by Maria Markstedter for ARM. From there, finding small binaries and just trying to get the “flow” is a good next step, for me this is largely renaming functions and variables and essentially trying to work the decompiled code into something readable, then you can find flaws.
So for the second thing, pulling the data off chips like that typically involves some specialized hardware, and you have to potentially deal with a bunch of cryptographic safeguards to read from the chip’s memory. Not impossible though, and there are not always good safeguards, but might be worth checking out some simpler programs and working up to it, or learning some basic hardware hacking to get an idea of how that process works.
Interesting! Yeah maybe my first step is on the hardware side, which I guess is what is blocking me right now.
Find an old piece of software you care about that is broken somehow, and abandoned. Most of my friends use these types of tools to reverse engineer abandoned MMOs and remake servers for them.
That's very deep water to dive into. I suggest something simpler, like an ancient irc client that asks you to sign up, or an archive extractor.
Well, I didn't mean dive into an MMO right away, but yes, I recommend smaller programs.
https://pwn.college has really good modules/dojos that cover a bunch of reverse engineering concepts.
I personally learn best by doing, which is why I love learning with LLMs. They're going to be wrong a lot, give bad advice, and do things in silly ways. I learn well from the process of working with them: seeing them fail constantly, then learning the tool myself by researching what they're doing wrong in order to fix it. I just attempted to use Ghidra to reverse engineer the game Shenmue from Dreamcast. I was previously unfamiliar with Ghidra and mostly did it as a learning exercise, but it wasn't really the right tool for the job. However, the project itself made lots of progress without it:
https://www.newyokosuka.com/
You can start here to learn reverse engineering.
https://beginners.re/
Awesome software!
It works surprisingly nicely with AI agents (I mean, like Cursor or Claude Code, I don't let it run autonomously!).
Here is a write-up on detecting malware in binaries (https://quesma.com/blog/introducing-binaryaudit/). I am now in the process of recompiling an old game, Chromatron, from a PowerPC binary to Apple Silicon and WASM (https://p.migdal.pl/chromatron-recompiled/, ready to play, though there might still be rough edges).
Funny thing, AI is not that terrible at using Ghidra. We released a benchmark on that and hopefully models will improve: https://quesma.com/blog/introducing-binaryaudit/
There are MCPs for Ghidra.
Yeah this. I saw some guys on youtube use AI MCPs to do some crazy reverse engineering.
It's difficult to be an AI doomer when you see stuff like this.
Would you have a link / links or hints about the channel?
Binary Ninja deserves a mention in these threads: https://binary.ninja
I've used IDA, Ghidra, and Binary Ninja a lot over the years. At this point I much prefer Binary Ninja for the task of building up an understanding of large binaries with many thousands of types and functions. It also doesn't hurt that its UI/UX feel like something out of this century, and it's very easy to automate using Python scripts.
One large-ish past thread and a few tinies, for anyone curious:
Binary Ninja – an interactive decompiler, disassembler, debugger - https://news.ycombinator.com/item?id=41297124 - Aug 2024 (1 comment)
Binary Ninja – 4.0: Dorsai - https://news.ycombinator.com/item?id=39546731 - Feb 2024 (1 comment)
Binary Ninja 3.0: The Next Chapter - https://news.ycombinator.com/item?id=30109122 - Jan 2022 (1 comment)
Binary Ninja – A new kind of reversing platform - https://news.ycombinator.com/item?id=12240209 - Aug 2016 (56 comments)
BN is nice if someone is paying for it, but has too many limitations especially for the most common use case which is security.
What are the limitations?
No shellcode decoding, no plugin support and rather limited IR.
> No shellcode decoding
Can't speak to this as I don't RE for security purposes, but:
> no plugin support and rather limited IR.
this I'm profoundly confused by. BN has multiple IRs that are easily accessible both in the UI and to scripts. And it certainly has a plugin system too.
Binary Ninja definitely has plugins?
The Linux free trial version is a 400MB .zip file including a 255.2MB "binaryninja" shared binary
https://github.com/Vector35/binaryninja-api/releases/downloa...
what's your point?
Yep, it's cheaper than IDA and I like the UI better. Also I love that it's made by game hacking folks (my clique).
Also this.
https://github.com/jart/blink
This is not really related
Binary Ninja seems way ahead in terms of UX, as a hobby reverser. It's my default as well.
Wow, they made it free. The last time I used it I bought a $100 subscription for non commercial use.
In particular, I like their approach of creating a modern IR pipeline.
Since we’re talking about decompilers, might as well mention the community around the research area: http://decompilation.wiki/
As well as the research history (slated to be updated in a few days): https://mahaloz.re/dec-progress-2024
I want to say if somebody makes a tool like that it would be a big winner https://qira.me/
Cutter[1] by RizinOrg[2].
[1] https://github.com/rizinorg/cutter
[2] https://github.com/rizinorg/rizin
+1
I once tried learning how to RE with radare2 but got very frustrated by frequent project file corruption (meaning radare2 could no longer open it). The way these project files work(ed?) in radare2 at the time was that it just saved all the commands you executed, instead of the state. This was brittle, in my experience.
I don't have a lot of free time, so I have to leave projects for long periods of time. Not being able to restart from a previous checkpoint meant I never actually got any further.
IIUC, one of the first things Rizin did was focus on saving the actual state, and backwards/forwards-compatibility. This fact alone made me switch to Rizin. To its credit, my 3-year old project file still works!
Now for the downside: there is apparently a gap in Windows (32-bit) PE support, causing stack variables to be poorly discovered: https://github.com/rizinorg/rizin/issues/4608. I tested this on radare2, which does not have this bug. I'm hoping this gets fixed in Rizin at some point, at which point I'll continue my RE adventure. Or maybe I should give an AI reverse engineer a try... (https://news.ycombinator.com/item?id=46846101).
Yes, we are working on rewriting analysis completely[1][2] that would fix your issue along with many others.
[1] https://github.com/rizinorg/rizin/pull/5505
[2] https://github.com/rizinorg/rizin/issues/4736
Can't wait! Do you have any idea how far along this is? Is it likely to be months, quarters, years?
(Funny expression, that. I'll wait, of course. It'll be a happy day when this works again and I can slowly make progress RE'ing again.)
Months.
I tried radare2 with the official GUI Iaito. Iaito saves the project in a git repo, so whenever I got corruption (and I got it a lot, like every 4-5 saves) I was just a `git reset --hard` away from restoring a good state. Not the most efficient way of operation, but for me it was better this than tolerating Ghidra's tiny Courier New font.
Thanks for the note.
Your corruption frequency anecdote matches mine. I don't have the mental wherewithal to deal with that. I won't go back to radare2 until they change their project file stability somehow.
Can anyone provide their opinion of Ghidra vs Ida? Is Ida worth the extra money?
For UI based manual reversing of things that run on an OS, IDA is quite superior; it has really good pattern matching and is optimized for this use case, so combined with the more ergonomic UI, it’s way way faster than Ghidra and is well worth the money (provided you are making money off of RE). The IDA debugger is also very fast and easy to use compared to Ghidra’s, provided your target works (again, anything that runs on an OS is probably golden here).
For embedded IDA is very ergonomic still, but since it’s not abstract in the way Ghidra is, the decompiler only works on select platforms.
Ghidra’s architecture lends itself to really powerful automation tricks since you can basically step through the program from your plugin without having an actual debug target, no matter the architecture. With the rise of LLMs, this is a big edge for Ghidra as it’s more flexible and easier to hook into to build tools.
The overall Ghidra plugin programming story has been catching up; it’s always been more modular than IDA but in the past it was too Java oriented to be fun for most people, but the Python bindings are a lot better now. IDA scripting has been quite good for a long time so there’s a good corpus of plugins out there too.
Almost every hobbyist reverse engineer uses cracked IDA which is easily available. I have never seen ghidra being recommended for serious work.
And everyone uses Ghidra exclusively where I work. I'd say we're a serious operation
This is changing, Ghidra is increasingly replacing IDA for commercial work.
I recommend it for serious work. Well, serious enough that I got paid for doing it, and/or given talks about it.
(not if you're only doing x86/ARM stuff, though)
Agree. IDA is surely the “primary” tool for anything that runs on an OS on a common arch, but once you get into embedded Ghidra is heavily used for serious work and once you get to heavily automation based scenarios or obscure microarchitectures it’s the best solution and certainly a “serious” product used by “real” REs.
The NSA doesn't do serious work?
That wasn't the claim. Ability + interest + time + budget + ... are what makes a serious tool.
Leading this by saying I've only used Ida free, I can't comment on Ida pro. I'm also a very lite user of both: I name functions/vars, save bookmarks, and occasionally work out custom types, and that's about it, none of the real fancy stuff.
I was recently trying to analyse a 600mb exe (denuvo/similar). I wasted a week after ghidra crashed 30h+ in multiple times. A separate project with a 300mb exe took about 5h, so there's some horrible scaling going on. So I tried out Ida for the first time, and it finished in less than an hour. Faced with having decomp vs not, I started learning how to use it.
So first difference, given the above, Ida is far far better at interrupting tasks/crash recovery. Every time ghidra crashed I was left with nothing, when Ida crashes you get a prompt to recover from autosave. Even if you don't crash, in general it feels like Ida will let you interrupt a task and still get partial results which you might even be able to pick back up from later, while ghidra just leaves you with nothing.
In terms of pure decomp quality, I don't really think either wins, decomp is always awkward, it's awkward in different ways for each. I prefer ghidra's, but that might just be because I've used it much longer. Ida does do better at suggesting function/variable names - if a variable is passed to a bunch of functions taking a GameManager*, it might automatically call it game_manager.
When defining types, I far prefer ida's approach of just letting me write C/C++. Ghidra's struct editor is awkward, and I've never worked out a good way of dealing with inheritance. For defining functions/args on the other hand, while Ida gives you a raw text box it just doesn't let you change some things? There I prefer the way ghidra does it, I especially like it showing what registers each arg is assigned to.
Another big difference I've noticed between the two is ghidra seems to operate on more of a push model, while Ida is more of a pull model - i.e. when you make a change, ghidra tends to hang for a second propagating it to everything referencing it, while Ida tries pulling the latest version when you look at the reference? I have no idea if this is how they actually work internally, it's just what it feels like. Ida's pull model is a lot more responsive on a large exe, however multiple times I've had some decomp not update after editing one of the functions it called.
Overall, I find Ida's probably slightly better. I'm not about to pay for Ida pro though, and I'm really uneasy about how it uploads all my executables to do decomp. While at the same time, ghidra is proper FOSS, and gives comparable results (for small executables). So I'll probably stick with ghidra where I can.
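To make the push/pull impression above concrete, here is a toy model of the two update strategies. The commenter is explicit that this is only how the tools feel, so this is not a description of either tool's internals; the classes and the "rendered text" are invented purely to show where the cost lands.

```python
class PushModel:
    """Every edit immediately re-renders all dependents: slow edit, cheap view."""
    def __init__(self, calls):
        self.calls = calls      # caller name -> callee name
        self.sigs = {}          # function name -> signature text
        self.views = {}         # caller name -> rendered "decompilation"
        self.renders = 0        # how much re-rendering work has been done

    def _render(self, caller):
        self.renders += 1
        self.views[caller] = f"{caller} calls {self.sigs[self.calls[caller]]}"

    def edit(self, func, sig):
        self.sigs[func] = sig
        for caller, callee in self.calls.items():   # propagate right away
            if callee == func:
                self._render(caller)

    def view(self, caller):
        return self.views[caller]


class PullModel(PushModel):
    """Edits are cheap; a dependent refreshes itself only when it is looked at."""
    def __init__(self, calls):
        super().__init__(calls)
        self.seen = {}          # caller -> signature it was last rendered with

    def edit(self, func, sig):
        self.sigs[func] = sig   # nothing else happens at edit time

    def view(self, caller):
        current = self.sigs[self.calls[caller]]
        if self.seen.get(caller) != current:        # staleness check on access
            self._render(caller)
            self.seen[caller] = current
        return self.views[caller]


calls = {f"caller{i}": "target" for i in range(1000)}
push, pull = PushModel(calls), PullModel(calls)
push.edit("target", "int target(int)")
pull.edit("target", "int target(int)")
print("renders at edit time:", push.renders, pull.renders)
pull.view("caller0")
print("renders after one view:", push.renders, pull.renders)
```

In this toy, the pull variant is only as correct as its staleness check; a check that misses a dependency would produce exactly the "decomp did not update" symptom described above.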
> I was recently trying to analyse a 600mb exe (denuvo/similar). I wasted a week after ghidra crashed 30h+ in multiple times.
During the startup auto analysis? For large binaries it makes sense to dial back the number of analysis passes and only trigger them if you really need them, manually, one by one. You also get to save in between different passes.
Yup. It was actually an openjdk crash, which was extra interesting.
I figured I probably could remove some passes, but being a lite user I don't really know/didn't want to spend the time learning how important each one is and how long they take. Ida's defaults were just better.
IDA is the better tool if you're being paid to work with architectures that IDA supports well (ARM(64), x86(_64), etc). This usually means 'mainstream' security/malware research. It's not worth the price for hobbyists. Before Hex-Rays was sold to private equity, it could make sense for rich hobbyists to pay for a private license once and use it for a few years without software updates; with the cloud offering now, it pretty much makes no sense.
Ghidra is the better tool if you're dealing with exotic architectures, even ones that you need to implement support for yourself. That's because any architecture that you have a full SLEIGH definition for will get decompilation output for free. It might not be the best decompiler out there, sure, but for some architectures it's the only decompiler available.
Both are generally shit UX wise and take time to learn. I've mostly switched from IDA to Ghidra a while back which felt like pulling teeth. Now when I sometimes go back to IDA it feels like pulling teeth.
Which exotic architectures is IDA missing from your perspective?
Stuff I've recently analyzed that IDA has no decomp support for (and Ghidra's is anywhere from good enough to actually good):
And probably more that I've forgotten. It's also not about lack of support, but the fact that you have to pay extra for every single decompiler. This sucks if you're analyzing a wide variety of targets because of the kind of work you do.
IDA also struggles with disasm for Harvard architectures which tend to make up a bulk of what I analyze - it's all faked around synthetic relocations. Ghidra has native support for multiple address spaces.
Binary Ninja supports some of them as well, highly recommend.
I really want to like Binary Ninja, but whenever I have the choice between not paying (Ghidra), paying for something that I know works (IDA) and paying for something that I don't know if it works (Binja) then the last option has always lost so far.
Maybe we need to get some good cracked^Wcommunity releases of Binja so that we can all test it as thoroughly as IDA. The limited free version doesn't cut it unfortunately - if I can't test it on what I actually want to use it for, it's not a good test.
(also it doesn't have collaborative analysis in anything but the 'call us' enterprise plan)
I first used Ghidra this weekend as part of this series:
https://www.youtube.com/watch?v=d7qVlf81fKA&list=PL4X0K6ZbXh...
(#3 forward uses Ghidra)
It worked fine in Ubuntu and Windows. The interface takes some getting used to, but paired with Bless Unofficial (using snap to install), it makes reverse engineering smooth.
Ghidra is a very impressive piece of software with a deep bench of functionality. The last couple of major releases, which move to a more integrated Python experience, have been very nice to use.
How do they incentivize government employees into doing such excellent work without paying them a real tech salary?
Use military members.
I was a special agent with an org involved in similar work. They put me through 7 SANS courses, including paying for 5 certs, in 18 months.
They are contractors. The public face of Ghidra works at Praxis, for example.
Great benefits and job security, and a belief in the mission.
The job security perk was recently defenestrated.
Hopefully seen as an aberration. Otherwise we may see the excellent work go out the window along with it.
It's been a while since I used this, but I decided to open the latest version to check my Rust binary and was pleasantly surprised by how much better it is today with Rust binaries.
Can you be more specific? Is it getting easier to reverse Rust and Go? I have read that they are the hardest to reverse.
It's not perfect, but in my personal experience it is still tough in languages like that due to the sheer volume of indirection and noise that makes it hard to follow. For example Go's calling convention is a little nutty compared to other languages, and you'll encounter a few *****ppppppppVar values that are otherworldly to make sense of, but the ability to recognize library functions and sys calls is for sure better.
Posting this on Github is a brilliant move by the NSA, and it showing up on HN amplifies it even more.
It's certainly not the first thing they've released (SELinux, for one, and then all the other repos in the account), but this repo showing up on HN, with a prominent call-to-action to look at a career with them, is a great way to target the applicants you want ("those who would find this project interesting, because it's just the sort of thing we need them to work on").
Atlassian used to do (maybe still does) this in bitbucket if you open dev tools - a link to their careers page shows up
There is also Hopper for ObjC/Swift, haven't tried it personally though
https://www.hopperapp.com
Hopper is pretty but worse than Ghidra for both
Works well. I used this tool once to disassemble and understand how the key manager works on Vivotek cameras.
They create executables, which contain encrypted binary data. Then, when the executable runs, it decodes the encrypted data and pipes it into "sh".
The security is delusional here - the password is hard coded in the executable. It was something like "VIVOTEK Inc.".
Ghidra was able to produce the C code, and I was also able to extract the binary data to a file (which is essentially the bash script).
Sounds like `strings' on the binary would've sufficed if it's just hardcoded.
No, that’s not enough.
The password would be visible, but the encryption algorithm and the script’s text wouldn’t.
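A small sketch of why a strings-style scan is not enough here. The thread does not say which cipher the firmware actually uses, so repeating-key XOR is only a stand-in, and the payload, the fake binary layout, and the helper names are all made up. The only detail taken from the thread is the password.

```python
import re
from itertools import cycle

def xor_bytes(data, key):
    """Repeating-key XOR; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def printable_strings(blob, min_len=4):
    """Rough equivalent of strings(1): runs of printable ASCII."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, blob)]

key = b"VIVOTEK Inc."                           # password quoted in the thread
script = b"#!/bin/sh\necho configuring keys\n"  # made-up payload
header = b"\x7fELF\x00\x00"                     # made-up prefix
binary = header + key + b"\x00\x00" + xor_bytes(script, key)

# What a strings-style scan sees: the password, plus ciphertext noise.
found = printable_strings(binary)
print(found)

# Recovering the script needs the algorithm and the payload offset as well,
# which is the part read out of the decompiled code.
payload_offset = len(header) + len(key) + 2
recovered = xor_bytes(binary[payload_offset:], key)
print(recovered.decode())
```

The scan does surface the password, but the script text only appears once you know how and where to apply it.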
Here are the main threads (in reverse order) that I found about Ghidra generally. Of course there have been many more threads about specific aspects or related projects: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....
(Btw, these links are just for anyone curious to read more - reposts are fine after a year or so - https://news.ycombinator.com/newsfaq.html)
NSA Ghidra open-source reverse engineering framework - https://news.ycombinator.com/item?id=40508777 - May 2024 (61 comments)
Ghidra 11.0 Released - https://news.ycombinator.com/item?id=38740793 - Dec 2023 (11 comments)
Ghidra 10.3 has been released - https://news.ycombinator.com/item?id=35908418 - May 2023 (6 comments)
NSA Ghidra software reverse engineering framework - https://news.ycombinator.com/item?id=35324380 - March 2023 (103 comments)
Ghidra: Software reverse engineering suite developed by NSA - https://news.ycombinator.com/item?id=33226050 - Oct 2022 (42 comments)
Ghidra: A software reverse engineering suite of tools developed by the NSA - https://news.ycombinator.com/item?id=27818492 - July 2021 (142 comments)
Ghidra 9.2 - https://news.ycombinator.com/item?id=25086519 - Nov 2020 (78 comments)
The Ghidra Book - https://news.ycombinator.com/item?id=24879314 - Oct 2020 (5 comments)
Ghidra Decompiler Analysis Engine - https://news.ycombinator.com/item?id=19599314 - April 2019 (30 comments)
Ghidra source code officially released - https://news.ycombinator.com/item?id=19572994 - April 2019 (7 comments)
Ghidra Capabilities – Get Your Free NSA Reverse Engineering Tool [pdf] - https://news.ycombinator.com/item?id=19319385 - March 2019 (17 comments)
Ghidra, NSA's reverse-engineering tool - https://news.ycombinator.com/item?id=19315273 - March 2019 (405 comments)
Ghidra - https://news.ycombinator.com/item?id=19239727 - Feb 2019 (59 comments)
NSA to Release Their Reverse Engineering Framework GHIDRA to Public at RSA - https://news.ycombinator.com/item?id=18828083 - Jan 2019 (90 comments)
Awful to use with a tiling window manager.
Opus 4.6 can use that from the CLI to do RE, produce pseudo-C, and later decode binaries into interpretable data based on that code.
amazing tool
unflutter supports ghidra :) https://news.ycombinator.com/item?id=47035788
I'm using a tool on Parallels on Mac that says "cannot run in virtual machine". Could I remove that check using Ghidra?
Yes, if you know what you’re looking for.
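As a rough idea of what removing such a check can look like once Ghidra has shown you where it is: often it comes down to changing one conditional branch. The sketch below is x86-only and entirely hypothetical. The byte sequence and the offset are invented, and a real program may have several checks, use a different branch form, or verify its own integrity.

```python
JZ_SHORT = 0x74    # x86 opcode: jz/je rel8
JMP_SHORT = 0xEB   # x86 opcode: jmp rel8

def patch_branch(image: bytes, offset: int) -> bytes:
    """Turn the short jz at `offset` into an unconditional short jmp."""
    if image[offset] != JZ_SHORT:
        raise ValueError(
            f"expected jz (0x74) at {offset:#x}, found {image[offset]:#04x}")
    patched = bytearray(image)
    patched[offset] = JMP_SHORT
    return bytes(patched)

# Made-up fragment:
#   e8 00000000   call is_virtual_machine   (hypothetical function)
#   85 c0         test eax, eax
#   74 05         jz   +5                   (not a VM: skip the error jump)
#   e9 00000000   jmp  show_error_and_exit  (hypothetical target)
#   90            nop                       (normal startup continues here)
code = bytes.fromhex("e800000000" "85c0" "7405" "e900000000" "90")
patched = patch_branch(code, 7)
print(code.hex())
print(patched.hex())
```

With the jz replaced by jmp, the error path is skipped regardless of what the check returned. In practice the offset in the file is not the same as the address Ghidra displays, so that mapping has to be worked out before patching on disk.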
Is ghidralite dot com a safe link or an official link?
When I try to expand their FAQ, it seems to try to open a (presumably) malicious link. I won't paste the link here just in case it really is malicious.
Just use the official github link or links that are linked there. The URL you mentioned seems bogus at best.
Curious, the ghidralite page download button links to the NSA's github releases page.
I wonder what the purpose of ghidralite dot com is. SEO spam? Are they building trust so they can later swap out the Download button for a poisoned binary?
Or climb up high enough in the search results and sell the domain to a malicious actor.
Looks like AI slop and SEO junk. The Guide page you linked opens with an article on Dubai sports car rental. There are also .net and .org variants of the domain, which appear to be also AI-generated slop. There's no such program as Ghidralite, and every site just links to the official Ghidra repository.
OllyDbg inspired: https://github.com/eteran/edb-debugger
Is it just me or is the merge style used for the repo very difficult to follow? Am I holding it wrong?
I always wondered whether they have a much more capable internal version. And I wonder the same thing for AI labs (they have to do a lot of lobotomy for their models to be ready for public use... but internally, they can just skip this perhaps?)
Very likely people who actually work on RE at the NSA also have access to IDA Pro licenses. I don't work in this space, so take it with a pinch of salt, but my understanding is this is a fairly long term strategic initiative to _eventually_ be the best tool.
It’s better in some dimensions and not others, and it’s built on a fundamentally different architecture, so of course they use both.
Ghidra excels because it is extremely abstract, so new processors can be added at will and automatically have a decompiler, control flow tracing, mostly working assembler, and emulation.
IDA excels because it has been developed for a gazillion years against patterns found in common binaries and has an extremely fast, ergonomic UI and an awesome debugger.
For UI driven reversing against anything that runs on an OS I generally prefer IDA, for anything below that I’m 50/50 on Ghidra, and for anything where IDA doesn’t have a decompiler, Ghidra wins by default.
For plugin development or automated reversing (even pre LLMs, stuff like pattern matching scripts or little evaluators) Ghidra offers a ton of power since you can basically execute the underlying program using PCode, but the APIs are clunky and until recently you really needed to be using Java.
Ghidra has a slightly different focus than IDA, so they're definitely not just using Ghidra :-)
I have only a very basic understanding of the two tools. Can you give me just some highlights regarding their differences?
Well, Ghidra's strength is batch processing at scale (which is why P-Code is less accurate than IDA's but still good enough) while allowing a massive amount of modules to execute. That allows huge distributed fleets of Ghidra. IDA has idalib now, and hcli will soon allow batch fleets, but IDA's focus is very much highly accurate analysis (for now), which makes it a lot less scalable performance wise (for now).
Too many people are in the know about this stuff, I think, to keep it hidden for that long. At the same time, we keep finding things for which that reasoning should have held and it didn't, so maybe you're right.
I doubt it. Ghidra is extremely extensible with their plugin/tool architecture. Public Ghidra includes the extremely helpful decompiler tool, and a few others, but I'm willing to bet that NSA uses regular Ghidra + some way more capable plugins instead of having another Ghidra.
Powerful, "capable" plugins are obvious; NSA cannot stop people from writing them, and they have little reason to restrict their use.
I think what NSA is likely to keep confidential are in-house plugins that are so specialized and/or underengineered that their publication would give away confidential information: stolen and illegitimate secrets (e.g. cryptographic private keys from a game console SDK), or exploits that they intend to deny knowledge of and continue milking, or general strategies and methods (e.g. a tool to "customize" UEFI images, with the implication that they have means to install them on a victim's computer).
The gains come from pairing Ghidra with a coding agent. It works amazingly well.
I'll second this. I used opencode + opus 4.6 + ghidra to reverse engineer a seedkey generation algorithm[1] from v850 assembly. I gave it the binary, the known address for the generation function, and a set of known inputs/outputs, and it was able to crack it.
[1] https://github.com/Mattwmaster58/ic204
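The known inputs/outputs are what make this kind of task checkable: any hypothesis read out of the decompilation can be tested against them mechanically. A minimal sketch of that loop follows. The rotate-and-XOR shape, the constant, and the pairs are all invented for illustration and have nothing to do with the actual algorithm in the linked repo.

```python
def candidate(seed: int, const: int) -> int:
    """Hypothetical algorithm shape: rotate a 16-bit seed left by 3, XOR a constant."""
    rotated = ((seed << 3) | (seed >> 13)) & 0xFFFF
    return rotated ^ const

def fit_constant(pairs):
    """Return every 16-bit constant consistent with all known (seed, key) pairs."""
    return [c for c in range(0x10000)
            if all(candidate(s, c) == k for s, k in pairs)]

# Made-up pairs, generated with const=0xBEEF to stand in for captured traffic.
known = [(s, candidate(s, 0xBEEF)) for s in (0x1234, 0xABCD, 0x0001)]

matches = fit_constant(known)
print([hex(c) for c in matches])
```

An empty result means the guessed shape is wrong and it is time to go back to the disassembly, which is roughly the feedback an agent gets when it is handed the same pairs.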
would you have a tutorial on that?
Are these tools useable by OpenClaw yet?
What does it do? I don't understand a thing. Can someone explain it to me?
Strange to see the NSA using Java, maybe this is really old?
Some of the comment matches in the code search suggest at least portions of the codebase go back to the very late 90s.
Edit: Wikipedia has a table with 1.0 being 2003 https://en.wikipedia.org/wiki/Ghidra
Yes, it’s from the late 90s/early 00s, but why is it strange to see Java?
Is this backdoored just like SELinux?
This was discussed when Ghidra was first open sourced. To the best of my knowledge, nobody's found an NSA backdoor in Ghidra.
Seems like it would be of limited value to backdoor a program like Ghidra. Might be useful in identifying security researchers, except that it's also the kind of program that will often be running on disconnected systems with little valuable data beyond whatever is being disassembled.
Without providing any proof that either this or SELinux is backdoored.
Well it’s open source, so you can check in principle. I would imagine there’s some fame and notoriety in discovering that.