Believing that LLM-enabled tools can play a role in solving this problem, I've built a tool that automatically generates documentation for legacy codebases using the Model Context Protocol (MCP) & Claude Sonnet. At first glance, I think this approach has merit. Some samples are in the README. I welcome your thoughts.
The Problem:
- Legacy codebases are notoriously difficult to understand and navigate
- Onboarding new developers takes months
- Making changes safely requires deep knowledge of the system
- Business stakeholders lack visibility into system architecture
The Solution - an MCP-based tool (sketched below) that:
- Scans your codebase
- Generates README files at each level of the directory structure
- Creates C4 architecture diagrams showing system components and relationships
- Builds a complete documentation hierarchy from high-level architecture to implementation details
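To make the shape of this concrete, here's a minimal sketch of how such a server could expose codebase-scanning tools using the official MCP Python SDK. The server name, tool names, and file-filtering logic are my own illustration, not the actual McpDoc implementation:

  # Minimal sketch, assuming the MCP Python SDK; not the real McpDoc internals.
  from pathlib import Path
  from mcp.server.fastmcp import FastMCP

  mcp = FastMCP("doc-generator")  # hypothetical server name

  @mcp.tool()
  def list_source_files(root: str, extensions: str = ".py,.ts,.java") -> str:
      """Return a newline-separated list of source files under `root`."""
      exts = tuple(e.strip() for e in extensions.split(","))
      files = sorted(str(p) for p in Path(root).rglob("*") if p.suffix in exts)
      return "\n".join(files)

  @mcp.tool()
  def read_file(path: str, max_chars: int = 20000) -> str:
      """Return the (truncated) contents of one file for the model to summarise."""
      return Path(path).read_text(errors="replace")[:max_chars]

  if __name__ == "__main__":
      mcp.run()  # stdio transport; Claude Desktop or another MCP client connects here

An MCP client can then call these tools and let the model itself write the per-directory READMEs and the C4 diagram sources.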
The tool aims to help teams:
- Onboard developers faster with clear system documentation
- Make changes confidently with better understanding of components
- Communicate system architecture to stakeholders
- Maintain living documentation that evolves with the codebase
Have a look / try it out!
GitHub: https://github.com/jonverrier/McpDoc
License: MIT
To credit various other similar works:
https://news.ycombinator.com/item?id=43154065 (jtwaleson's post)
https://news.ycombinator.com/item?id=42521769
https://news.ycombinator.com/item?id=41393458
I’ve been bouncing between LLMs for "deep research" tasks lately — trend discovery, digging into niche use cases, finding the right community for topic <X>. Reddit usually has the best signal for these, but getting reliable results from an LLM is tricky.
A few recurring issues:
- Hallucinated subreddits: even with simple questions like “What’s the best subreddit for <X>?”, models often return made-up names.
- Knowledge cutoffs: LLMs miss recent or trending content and topics, even when they're easy to find on Reddit (try asking Claude what MCP is…)
- Slow web tools: ChatGPT's "Deep Research" and similar products can surface up-to-date info, but they're slow, often taking 5+ minutes to answer anything but the most trivial queries.
To make this faster and more consistent, I built a plug-and-play MCP server that connects directly to the Reddit API. It exposes a small set of structured tools (search, fetch, browse) that LLMs can use to retrieve Reddit posts and comments in real time. It’s read-only for now to prevent any unintended side effects, but I might add write access in the future.
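For a sense of what those structured tools can look like, here's a minimal sketch using the MCP Python SDK plus PRAW for the Reddit API. The tool names and return formats are illustrative, not necessarily what the linked server exposes:

  # Minimal sketch, assuming PRAW + the MCP Python SDK; tool names are hypothetical.
  import os
  import praw
  from mcp.server.fastmcp import FastMCP

  mcp = FastMCP("reddit-research")
  # With only client_id/client_secret, PRAW runs in read-only (app-only) mode.
  reddit = praw.Reddit(
      client_id=os.environ["REDDIT_CLIENT_ID"],
      client_secret=os.environ["REDDIT_CLIENT_SECRET"],
      user_agent="mcp-reddit-sketch/0.1",
  )

  @mcp.tool()
  def search_subreddits(query: str, limit: int = 5) -> str:
      """Find real subreddits matching a query (avoids hallucinated names)."""
      subs = reddit.subreddits.search(query, limit=limit)
      return "\n".join(f"r/{s.display_name} ({s.subscribers} subscribers)" for s in subs)

  @mcp.tool()
  def search_posts(subreddit: str, query: str, limit: int = 10) -> str:
      """Search one subreddit and return titles, scores, and permalinks."""
      posts = reddit.subreddit(subreddit).search(query, limit=limit)
      return "\n".join(f"[{p.score}] {p.title} https://reddit.com{p.permalink}" for p in posts)

  if __name__ == "__main__":
      mcp.run()

Keeping the tools read-only (search and fetch, no posting or voting) is what makes it safe to let the model drive the toolchain on its own.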
Here’s a demo of a request that a regular LLM without any tools simply cannot complete, and that a tool like ChatGPT Deep Research takes >5 minutes on while returning worse results: asking the model to find highly upvoted comments in niche LLM subreddits. Demo video (sped up ~3x to reduce file size): https://github.com/user-attachments/assets/a2e9f2dd-a9ac-453...
The model runs a toolchain with a set of structured queries first to identify “niche” subreddits, then looks for promising posts and comments, and iterates on this until it thinks it’s satisfied the original request. It then returns a grounded answer — all in <2.5 minutes.
You can run this locally with Claude Desktop or any other MCP client (I actually use it with AutoGen -- example in the repo): just plug in your Reddit API keys and go. There’s more in the README if you want to try it out, and I’m happy to provide help, as the MCP ecosystem is… not mature yet. Even as a professional software engineer I found building an MCP server a total PITA, so I'm happy to pay it forward and help anyone who needs guidance on whatever they’re building.
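If you'd rather drive it from plain Python than from Claude Desktop, the MCP SDK's stdio client can launch the server and call its tools directly. A minimal sketch, assuming a server entry point named reddit_mcp_server.py and the hypothetical tool names from the sketch above (adjust to whatever the repo actually exposes):

  # Minimal client sketch using the MCP Python SDK; paths and env var names are assumptions.
  import asyncio
  from mcp import ClientSession, StdioServerParameters
  from mcp.client.stdio import stdio_client

  server = StdioServerParameters(
      command="python",
      args=["reddit_mcp_server.py"],  # hypothetical entry point
      env={"REDDIT_CLIENT_ID": "...", "REDDIT_CLIENT_SECRET": "..."},
  )

  async def main():
      async with stdio_client(server) as (read, write):
          async with ClientSession(read, write) as session:
              await session.initialize()
              tools = await session.list_tools()
              print([t.name for t in tools.tools])  # discover what the server exposes
              result = await session.call_tool(
                  "search_subreddits", arguments={"query": "local LLMs", "limit": 5}
              )
              print(result.content)

  asyncio.run(main())

For Claude Desktop the equivalent is an entry in its MCP server config pointing at the same command and environment variables.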
Would love feedback — especially if you’ve been working on research tools, using MCP in novel ways, have a use case for this, or things you want to see in this space.