The data layer is the best predictor of product velocity

Operators and builders of early or mid-stage startups know that product velocity is the only thing they can control. If you can't build and iterate quickly, then someone else who can will show up and beat you at your own game. Investors know this too, and they attempt to derisk their investments by betting on teams who have a unique advantage to move quickly. This is not a chicken and egg problem - dumping money into a team that can't move quickly is like adding gasoline to car with a leak in the tank. So the question is how can engineering teams increase their capital efficiency? How many features and how many bugs are produced per dollar injected into an engineering team? The answer starts with the data layer.

Let the knife do the work

A vast majority of startups use a relational database like MySQL or Postgres to store their data and the patterns used to access and manipulate that data is what the data layer references. This is the foundation of many software applications and, as with any other part of a codebase, the goal here is to build quickly while limiting the chance of introducing bugs along the way. Typically those are opposing forces, but using an ORM like Prisma in the data layer is a bit of a cheat code.

An ORM, or Object Relational Mapper, is a tool that allows you to interact with your database using the language of your choice. For example, if you are using Prisma, then you can write your queries in Typescript and Prisma will translate those queries into SQL for you. This is a huge win for teams because it allows them to write queries in a language that they are familiar with while also allowing them to leverage the power of SQL. Without using an ORM, you would need to write your queries in raw SQL and then use a package like mysql2 to execute those queries. Below is an example of all of the code that you you would need to write to interact with a database that manages data for a simple app that allows users to make posts, collaborate on posts, and like other user's posts.

function getUserById(id: number): User {
        const user = await db.query('SELECT * FROM users WHERE id = ?', [id]);
        return user;
}

function getUsersPosts(userId: number): Post[] {
        const posts = await db.query('SELECT * FROM posts WHERE userId = ?', [userId]);
        return posts;
}

function getUsersCollaboratedPosts(userId: number): Post[] {
        const posts = await db.query('SELECT * FROM posts WHERE id IN (SELECT postId FROM postCollaborators WHERE userId = ?)', [userId]);
        return posts;
}

function getUsersLikedPosts(userId: number): Post[] {
        const posts = await db.query('SELECT * FROM posts WHERE id IN (SELECT postId FROM postLikes WHERE userId = ?)', [userId]);
        return posts;
}

function getLikesForPost(postId: number): PostLike[] {
        const likes = await db.query('SELECT * FROM postLikes WHERE postId = ?', [postId]);
        return likes;
}

function getCollaboratorsForPost(postId: number): PostCollaborator[] {
        const collaborators = await db.query('SELECT * FROM postCollaborators WHERE postId = ?', [postId]);
        return collaborators;
}

function createUser(name: string, email: string, phone: string): User {
        const user = await db.query('INSERT INTO users (name, email, phone) VALUES (?, ?, ?)', [name, email, phone]);
        return user;
}

function createPost(title: string, body: string, userId: number): Post {
        const post = await db.query('INSERT INTO posts (title, body, userId) VALUES (?, ?, ?)', [title, body, userId]);
        return post;
}

function createPostCollaborator(postId: number, userId: number): PostCollaborator {
        const collaborator = await db.query('INSERT INTO postCollaborators (postId, userId) VALUES (?, ?)', [postId, userId]);
        return collaborator;
}

function createPostLike(postId: number, userId: number): PostLike {
        const like = await db.query('INSERT INTO postLikes (postId, userId) VALUES (?, ?)', [postId, userId]);
        return like;
}

function deleteUser(userId: number): void {
        await db.query('DELETE FROM users WHERE id = ?', [userId]);
}

function deletePost(postId: number): void {
        await db.query('DELETE FROM posts WHERE id = ?', [postId]);
}

function deletePostCollaborator(postId: number, userId: number): void {
        await db.query('DELETE FROM postCollaborators WHERE postId = ? AND userId = ?', [postId, userId]);
}

function deletePostLike(postId: number, userId: number): void {
        await db.query('DELETE FROM postLikes WHERE postId = ? AND userId = ?', [postId, userId]);
}

function updateUser(userId: number, name: string, email: string, phone: string): User {
        const user = await db.query('UPDATE users SET name = ?, email = ?, phone = ? WHERE id = ?', [name, email, phone, userId]);
        return user;
}

function updatePost(postId: number, title: string, body: string): Post {
        const post = await db.query('UPDATE posts SET title = ?, body = ? WHERE id = ?', [title, body, postId]);
        return post;
}

function updatePostCollaborator(postId: number, userId: number): PostCollaborator {
        const collaborator = await db.query('UPDATE postCollaborators SET postId = ?, userId = ? WHERE postId = ? AND userId = ?', [postId, userId, postId, userId]);
        return collaborator;
}

function updatePostLike(postId: number, userId: number): PostLike {
        const like = await db.query('UPDATE postLikes SET postId = ?, userId = ? WHERE postId = ? AND userId = ?', [postId, userId, postId, userId]);
        return like;
}

That is just the tip of the iceberg. We would also need to need to create types for all of the tables, write tests for all of those functions, and then maintain all of that code as the app grows.

How does growth look?

The truth is that the data layer really only predicts product velocity in startups that are growing and have a long arc of fast-paced engineering. The example code above looks clean and simple, but the codebase now compared to 1 year from now for a lean and agressive startup would look expoentially different if they went down that path. The size of the codebase will have grown significantly, and increasing the surface area inevitably increases the chances of introducing bugs. New engineers will join the team with their own styles and, even if they copy/paste the code for their new features, small tweaks will cause patterns to evolve over time. This is where the data layer becomes a liability. The data layer probably contains the most reused code in any codebase, so any hiccup in the data layer will likely have multiple downstream effects.

The sales team feedback loop

The engineering team can only release so many bugs before the sales and customer success teams start to feel the pain. Customers are paying for a product that provides consistent value and, given that bugs in the data layer often have some of largest blast radius, there will only need to be a few of them before a larger solution is in order. This typically means

  1. Building out a more comprehensive test suite
  2. Adding more monitoring or error detection tools to the codebase
  3. Implementing a more robust code review process as a part of the engineering team culture

All of these are great additions to engineering team and they will undoubtibly lead to a more stable product, but this is a big step in the wrong direction when resources are limited. The engineering team is now spending more time on things that don't directly add value. The engineers will are now working on projects that they don't particularly enjoy and are increasing the surface area of their system while they are doing it.

Product velocity

We've arrived in a place where early choices in the design of the data layer have had a significant impact on the product velocity of the company. It's a typically a good idea to deligate responsibility whenever possible within a tech stack. For the same reason as you should use SendGrid instead of building your own email server, you should use a tool like Prisma instead of building your own data layer. There is always a time and place to be creative and write code for a system in-house, but choosing to do that for the most fundamental level of a product will inenvitably slow development down regardless of how skilled the engineers are. Resources are scarce for early stage startups and a well architected data layer will ensure that they spent wisely.