As an experienced backend engineer, I have been using Cursor to explore the potential of coding agents. The experiment (building a toy database from scratch) ran from December 2024 to March 2025 and produced 32 commits and thousands of lines of code. Here are the three lessons I learned from it.
TL;DR: A coding agent is a capable peer engineer you can leverage in every part of the SDLC, as long as you do it intentionally.
#1 Context is key
A database can be approached at several levels, from the high-level architecture down to the detailed implementation, and each level needs different context. Feeding the coding agent the right amount of context is critical. The agent is capable at every part of the system, from architecture to coding, but you need to treat it as a buddy to brainstorm, communicate, and think with.
At a high level, choose your overall direction first. A database could be relational, key-value, inverted-index search, document-based, vector-based, etc. It is easy for the coding agent to generate framework code for you, but you need to decide which architecture style you want to go with. In my case, I went with relational and asked Cursor to stick to the Postgres implementation by defining a Cursor rule.
At the component level, try to set up a minimal structure to get started and iterate incrementally. A database is composed of storage, index, query, parser, execution, transaction, maintenance, admission control, etc. It is easy for the coding agent to copy other implementations wholesale and overwhelm both your mind and your project. Start small: I started with memory-based storage, then focused on the query engine, and added the index later. I had tried to go faster and let the coding agent do everything, but then I had to revert and take a step back, simply because the project was out of control.
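To make "start small" concrete, here is a minimal sketch of what a memory-based storage layer could look like. It is not the code from my project: the `MemoryTable` class, its `insert` and `scan` methods, and the dict-per-row layout are all hypothetical names and choices picked for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterator, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class MemoryTable:
    """Hypothetical in-memory table: maps a row id to a row (illustration only)."""
    name: str
    rows: Dict[int, Row] = field(default_factory=dict)
    next_row_id: int = 1

    def insert(self, row: Row) -> int:
        """Store a copy of the row and return the row id assigned to it."""
        row_id = self.next_row_id
        self.rows[row_id] = dict(row)
        self.next_row_id += 1
        return row_id

    def scan(
        self, predicate: Optional[Callable[[Row], bool]] = None
    ) -> Iterator[Tuple[int, Row]]:
        """Sequential scan; a query engine can be layered on top of this later."""
        for row_id, row in self.rows.items():
            if predicate is None or predicate(row):
                yield row_id, row


# Usage: two inserts followed by a filtered scan.
users = MemoryTable("users")
users.insert({"id": 1, "name": "ada"})
users.insert({"id": 2, "name": "bob"})
matches = [row for _, row in users.scan(lambda r: r["name"] == "bob")]
```

A surface this small is easy to hold in your head and easy to describe to the agent, so the query engine and the index can be added one at a time against it.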
At the detailed level, a specific algorithm can be implemented in different styles. For example, MVCC (multi-version concurrency control) can be implemented in several ways. Brainstorm with the coding agent, think together about which way is best, and then commit to that path. In my case, I brainstormed with the agent on the version chain, which could live on the tuple itself or in a separate data structure. The original storage layer was better suited to the tuple level, which made the implementation easier.
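As a rough illustration of the tuple-level option, the sketch below keeps the version chain on the tuple itself: each version records the commit timestamps between which it is visible and links to the version it replaced. This is a simplified sketch, not my project's actual code. The names `TupleVersion`, `VersionedStore`, `begin_ts`, `end_ts`, and `prev` are assumptions made for the example, and it deliberately ignores uncommitted writes, write conflicts, deletes, and garbage collection.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TupleVersion:
    """One version of a tuple (hypothetical layout for illustration)."""
    value: Any
    begin_ts: int                          # commit timestamp that created this version
    end_ts: Optional[int] = None           # commit timestamp that replaced it; None = current
    prev: Optional["TupleVersion"] = None  # link to the next-older version


class VersionedStore:
    """Keeps the newest version per key; older versions hang off it as a chain."""

    def __init__(self) -> None:
        self.heads: Dict[str, TupleVersion] = {}

    def write(self, key: str, value: Any, commit_ts: int) -> None:
        """Install a new head version and close the visibility window of the old one."""
        old = self.heads.get(key)
        if old is not None:
            old.end_ts = commit_ts
        self.heads[key] = TupleVersion(value, commit_ts, None, old)

    def read(self, key: str, snapshot_ts: int) -> Optional[Any]:
        """Walk newest to oldest and return the version visible at snapshot_ts."""
        version = self.heads.get(key)
        while version is not None:
            still_open = version.end_ts is None or snapshot_ts < version.end_ts
            if version.begin_ts <= snapshot_ts and still_open:
                return version.value
            version = version.prev
        return None


# Usage: two committed writes, then reads at different snapshots.
store = VersionedStore()
store.write("k", "v1", commit_ts=10)
store.write("k", "v2", commit_ts=20)
old_read = store.read("k", snapshot_ts=15)  # sees the version committed at 10
new_read = store.read("k", snapshot_ts=25)  # sees the version committed at 20
```

Because the chain hangs off the tuple, the storage layer needs no second lookup structure, which is the kind of trade-off worth talking through with the agent before any code is generated.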