
Why SQL Formatters Disagree on Keyword Case and Indent Style


Paste the same query into three different SQL formatters and you'll get three different answers. One uppercases every keyword. Another leaves them alone. One puts commas at the start of the line, another at the end, and a third doesn't touch commas at all. If you've ever wondered whether you're using the formatter wrong, you're not. There is no single agreed-upon SQL style, and that's the actual root of the confusion.

Unlike languages such as Python, where PEP 8 gives the ecosystem one dominant convention almost everyone defers to, SQL has never had that moment. It's older than most style guides, it's implemented across a dozen incompatible dialects, and different database vendors and BI tools each grew their own habits over decades. A formatter has to pick a side, and every formatter picks a slightly different one.

Keyword case is a legacy argument, not a technical one

The uppercase-keyword convention (SELECT, FROM, WHERE) goes back to an era when SQL editors had no syntax highlighting. Capitalizing SELECT and FROM was the only way to visually separate the language's structure from your table and column names. It worked, and it stuck, especially in enterprise shops and older DBA-authored style guides.


Modern editors don't need that crutch. Syntax highlighting already colors keywords differently from identifiers, so a lot of newer teams write select, from, and where in lowercase and let the editor do the visual separation. Both conventions are internally consistent. Neither is "correct" in any technical sense, which is exactly why formatters split roughly down the middle on the default.

The practical downside of mixed case shows up when you're skimming a long query fast: inconsistent casing (some keywords upper, some lower, because three people edited the file over two years) is genuinely harder to read than either convention applied consistently. That's the real argument for running a formatter at all, not which case it defaults to.
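
To make the consistency point concrete, here's a minimal sketch of keyword-case normalization in Python. The KEYWORDS set and the normalize_keywords function are illustrative names, not taken from any real formatter, and the keyword list is deliberately tiny. The regex only knows about single-quoted string literals; it ignores comments and quoted identifiers, which a real formatter has to handle.

```python
import re

# Illustrative subset only; a real formatter carries a full, dialect-aware list.
KEYWORDS = {"select", "from", "where", "join", "on", "and", "or",
            "group", "order", "by", "as"}

# Match a single-quoted string literal first, so words inside it are never
# treated as keywords; otherwise match a bare word.
TOKEN = re.compile(r"'(?:[^']|'')*'|\b[A-Za-z_][A-Za-z0-9_]*\b")


def normalize_keywords(sql: str, upper: bool = True) -> str:
    """Rewrite known keywords to one case and leave everything else alone."""

    def fix(match: re.Match) -> str:
        text = match.group(0)
        if text.startswith("'") or text.lower() not in KEYWORDS:
            return text  # string literal or identifier: untouched
        return text.upper() if upper else text.lower()

    return TOKEN.sub(fix, sql)


mixed = "Select name from users WHERE note = 'from here'"
print(normalize_keywords(mixed))
print(normalize_keywords(mixed, upper=False))
```

Either call produces a query where every keyword follows one convention, and the text inside the string literal is left exactly as written. Which of the two outputs you prefer is the part that's a matter of taste.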

Comma placement is about diffs, not aesthetics

Leading commas (commas at the start of each line, before the column name) look unusual the first time you see them, but they exist for a concrete reason: version control diffs. If you append a column to a SELECT list written with trailing commas, the diff touches two lines: the new column's line and the previous line, which gains a comma. With leading commas, appending a column touches only one line. In a codebase where SQL lives in migration files or version-controlled query definitions, that's a real, measurable improvement in diff readability.
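
You can check the diff claim with Python's standard difflib module. This is a small sketch, not a benchmark: diff_stats is a hypothetical helper, and the users table and its columns are made up for the example. It counts how many lines a line-based diff removes and adds when one column is appended.

```python
import difflib


def diff_stats(before: list[str], after: list[str]) -> tuple[int, int]:
    """Return (lines removed, lines added) for a line-based diff."""
    removed = added = 0
    matcher = difflib.SequenceMatcher(None, before, after)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Trailing commas: appending "email" also rewrites the "name" line.
trailing_before = ["SELECT", "    id,", "    name", "FROM users"]
trailing_after = ["SELECT", "    id,", "    name,", "    email", "FROM users"]

# Leading commas: appending "email" is a pure one-line insertion.
leading_before = ["SELECT", "    id", "    , name", "FROM users"]
leading_after = ["SELECT", "    id", "    , name", "    , email", "FROM users"]

print(diff_stats(trailing_before, trailing_after))  # (1, 2)
print(diff_stats(leading_before, leading_after))    # (0, 1)
```

The trailing-comma version removes one line and adds two, while the leading-comma version only adds one. Note that leading commas move the special case to the other end of the list: inserting a new first column touches two lines instead.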

If your SQL never goes through a diff-based review process, that whole argument evaporates and trailing commas just look more normal to more people. This is a case where the "right" answer genuinely depends on your workflow, not on some abstract notion of clean code.

"The formatting argument developers actually care about is rarely the one they think they care about. Case and commas feel like taste, but the real cost is a team spending review time debating style instead of logic. Pick one convention, write it down, stop relitigating it." - Dennis Traina, founder of 137Foundry

Indent depth interacts badly with nested subqueries

Two-space and four-space indentation both work fine for a flat SELECT ... FROM ... WHERE query. The disagreement gets sharper once you nest a subquery inside a WHERE clause, or stack a few CTEs (WITH clauses) on top of each other. Four-space indents make nesting visually obvious, but three or four levels deep, lines start wrapping awkwardly on a normal-width screen. Two-space indents fit more nesting on screen but can make it harder to tell at a glance which SELECT a given column belongs to.
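
The width tradeoff is simple arithmetic: each nesting level costs one indent width in columns. Here's a rough sketch that renders the same nested query at both widths and reports the longest line. The render and widest_line helpers, and the orders/customers/regions query, are invented for illustration.

```python
def render(lines: list[tuple[int, str]], indent_width: int) -> str:
    """Render (depth, text) pairs using a fixed indent width per level."""
    return "\n".join(" " * (depth * indent_width) + text for depth, text in lines)


def widest_line(rendered: str) -> int:
    """Length of the longest line in the rendered query."""
    return max(len(line) for line in rendered.splitlines())


# A made-up query nested two levels deep.
query = [
    (0, "SELECT o.id"),
    (0, "FROM orders o"),
    (0, "WHERE o.customer_id IN ("),
    (1, "SELECT c.id"),
    (1, "FROM customers c"),
    (1, "WHERE c.region_id IN ("),
    (2, "SELECT r.id"),
    (2, "FROM regions r"),
    (2, "WHERE r.name = 'EMEA'"),
    (1, ")"),
    (0, ")"),
]

for width in (2, 4):
    text = render(query, width)
    print(f"indent {width}: widest line is {widest_line(text)} columns")
    print(text)
```

At two levels the difference is small. It grows linearly with depth, which is why the four-space style starts to hurt around the third or fourth level.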


Some formatters solve this by aligning nested clauses to their parent keyword instead of using a fixed indent width at all, which reads beautifully for a three-line subquery and turns into a jagged staircase for a ten-line one. There's no configuration that wins in every case, which is why most formatters expose indent width as a setting instead of hardcoding an opinion.

Dialect differences make "correct" formatting a moving target

Beyond pure style, formatters also have to account for dialect-specific syntax. PostgreSQL's RETURNING clause, MySQL's backtick-quoted identifiers, SQL Server's square-bracket identifiers, and BigQuery's backtick-delimited table paths all format differently even when the underlying query logic is identical. A formatter tuned for one dialect can mangle valid syntax in another, or "fix" something that wasn't broken.

This is the reason a general-purpose SQL formatter has to either declare a target dialect up front or use conservative, dialect-agnostic rules that won't break anything, even if that means it can't apply the more opinionated formatting a single-dialect tool could. It's a real tradeoff, not a bug.
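
Here's a small sketch of why the target dialect matters even for something as basic as recognizing an identifier. The IDENTIFIER_QUOTES table and is_quoted_identifier function are hypothetical, and the table is simplified on purpose: it lists one common quoting style per engine, while real engines accept additional forms depending on server settings.

```python
# Simplified assumption: one common identifier-quoting style per dialect.
IDENTIFIER_QUOTES = {
    "postgresql": ('"', '"'),
    "mysql": ("`", "`"),
    "sqlserver": ("[", "]"),
    "bigquery": ("`", "`"),
}


def is_quoted_identifier(token: str, dialect: str) -> bool:
    """True if the token looks like a quoted identifier in this dialect."""
    open_quote, close_quote = IDENTIFIER_QUOTES[dialect]
    return (
        len(token) >= 2
        and token.startswith(open_quote)
        and token.endswith(close_quote)
    )


for dialect in IDENTIFIER_QUOTES:
    print(dialect, is_quoted_identifier("[order]", dialect))
```

Only the sqlserver entry recognizes [order] as a single identifier. A formatter working from any of the other rows would see a bracket, the keyword ORDER, and another bracket, which is exactly how valid syntax gets mangled.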

Automating the choice instead of relitigating it in review

Once a team picks a convention, the fastest way to make it stick is to stop enforcing it by hand. A human reviewer flagging inconsistent keyword case in a pull request is a waste of a review cycle. That kind of check belongs in a pre-commit hook or a CI step that runs a formatter against every changed .sql file and fails the build if the output doesn't match what's already committed.


This is the same reasoning that put gofmt into the Go toolchain by default and Prettier into most JavaScript projects: once formatting is automatic, it stops being a topic anyone has an opinion about. Nobody argues about tab width in a Go file anymore, because gofmt made the argument moot. SQL has no equivalent default, because no single shared toolchain processes every SQL file the way the Go toolchain processes every Go file. The same pattern still works when you bolt a formatter onto your own pipeline.

The setup is usually simple: run the formatter, diff its output against the file as committed, and fail if they differ. That's a five-line CI step in most systems, and it converts an infinite, recurring style debate into a one-time decision that never needs to be revisited. Teams that skip this step tend to relitigate the same case-and-comma argument every few months, usually right after a new hire's first pull request gets nitpicked for using the "wrong" convention nobody had actually written down.
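
The check described above might look something like this in Python. Treat it as a sketch under assumptions: format_sql is a stand-in that only strips trailing whitespace, and you'd replace it with a call to whatever formatter your team actually picked. The function names are made up for the example.

```python
from pathlib import Path


def format_sql(sql: str) -> str:
    """Placeholder formatter (assumption): strip trailing whitespace only.

    Swap this for a call to your real formatter.
    """
    return "\n".join(line.rstrip() for line in sql.splitlines()) + "\n"


def is_formatted(sql: str, formatter=format_sql) -> bool:
    """True if running the formatter would leave the text unchanged."""
    return formatter(sql) == sql


def main(paths: list[str]) -> int:
    """Return 1 if any file would be changed by the formatter, else 0."""
    failures = [
        path for path in paths
        if not is_formatted(Path(path).read_text(encoding="utf-8"))
    ]
    for path in failures:
        print(f"would reformat: {path}")
    return 1 if failures else 0


# A CI entry point would pass the changed .sql files and exit with the result,
# for example: raise SystemExit(main(sys.argv[1:]))
```

A non-zero exit code is what fails the build. The check never rewrites files itself, so a failing run tells the author to run the formatter locally and commit the result.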

If your queries live in a dbt project, a migrations folder, or anywhere else under version control, this is worth setting up before your SQL file count gets large enough that reformatting everything at once becomes its own risky pull request. Do it early, and the formatter's specific opinions on case or commas stop mattering, because nobody has to think about them again.

What actually matters when you pick a convention

None of this is a reason to skip formatting your SQL. Consistency is the entire point, and consistency compounds: a codebase where every query follows the same case and indent rules is faster to review and easier to onboard new people into, regardless of which specific rules you picked. The practical approach:

  • Pick uppercase or lowercase keywords, not both, and apply it retroactively to old queries when you touch them.
  • If your SQL lives in version control, lean toward leading commas. If it mostly lives in a BI tool's query box, trailing commas are fine.
  • Set indent width once as a team decision and stop revisiting it per pull request.
  • If you work across multiple database engines, confirm your formatter supports each dialect you actually use rather than assuming ANSI SQL rules translate cleanly.

You can run a query through EvvyTools' SQL Formatter to see one consistent, opinionated take on all four of these questions at once, applied instantly to whatever you paste in. It won't settle the industry-wide argument, but it will settle it for your team, which is the part that actually matters day to day.

Formatting is a readability tool, not a linting tool

It's worth being clear about what a formatter does and doesn't do. Formatting changes whitespace, case, and layout. It does not validate that your query is logically correct, that your joins are on the right keys, or that your WHERE clause matches your intent. Treat a formatter as a readability pass you run before a query goes into review, not as a substitute for actually reading what the query does.


That distinction matters most in code review. A well-formatted query is faster for a reviewer to parse, which means they have more attention left over to check the actual logic instead of getting stuck decoding inconsistent indentation. That's the entire economic case for running every query you write through a formatter: not because unformatted SQL is wrong, but because formatted SQL gets reviewed better and faster.

If you're setting a team standard from scratch, start from the default output of a formatter in the EvvyTools tools directory, write it down in your team's contributing guide, and move on. For more background on SQL in general, Wikipedia's SQL article covers the standard's history and dialect fragmentation in more depth, and PostgreSQL's documentation is a solid reference if Postgres is your primary dialect. For a broader look at how style guides get adopted across a language ecosystem, PEP 8 is the closest analogue to what SQL never quite got. It's worth reading even if you never write a line of Python, just to see what SQL is missing. You can also browse the full EvvyTools blog for more deep dives like this one.

137 Foundry — custom app building studio