Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DuckDB Dialect Support #738

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

hughcameron
Copy link

This pull request adds support for the DuckDB SQL dialect to the SQL Formatter library.

Description:

  • Extends the supported dialects to include DuckDB.
  • Updates the documentation (README.md and potentially others) to reflect the addition of DuckDB support.
  • Includes any necessary tests to ensure proper formatting for DuckDB queries.

Benefits:

  • Users working with DuckDB can now leverage the SQL Formatter library for consistent and readable SQL code.
  • Enhances the overall library coverage by including a popular in-memory database.

Testing:

  • A test suite is included to verify accurate formatting for various DuckDB queries.

Please review the changes and provide feedback.

Copy link

codesandbox-ci bot commented Apr 30, 2024

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

@nene
Copy link
Collaborator

nene commented Apr 30, 2024

Thanks for the PR. A few quick questions and thoughts:

  • What's the relationship between DuckDB and PostgreSQL? I saw in DuckDB docs that it uses PostgreSQL parser. But does it actually support all of the PostgreSQL syntax? Like, does it support all these operators that you have listed?
  • You probably have listed too many keywords. Best to only include reserved keywords. Otherwise you'll have common field names like id, name, location, type detected as keywords and converted to uppercase when using keywordCase: "upper".
  • Your data types list seems to be missing some basic stuff like INT, CHAR, ARRAY.

I'm pretty busy this week... not sure how much time I have to properly review this.

@nene
Copy link
Collaborator

nene commented Apr 30, 2024

For bonus points, you can update the wiki with information about DuckDB. That will also make it easier for me to review this. Otherwise I'll have to go and figure out all of this about DuckDB by myself.

@PMassicotte
Copy link

I was about to submit a PR also, I will make some comment in your code.

src/languages/duckdb/duckdb.formatter.ts Show resolved Hide resolved
src/languages/duckdb/duckdb.formatter.ts Outdated Show resolved Hide resolved
src/languages/duckdb/duckdb.formatter.ts Outdated Show resolved Hide resolved
src/languages/duckdb/duckdb.keywords.ts Show resolved Hide resolved
@PMassicotte
Copy link

@hughcameron
Copy link
Author

Thanks for the comments above. The errors from the test suite are now down to five:

DuckDBFormatter
    ✕ supports ARRAY[] literals (2 ms)
    ✕ dataTypeCase option does NOT affect ARRAY[] literal case
    ✕ keywordCase option affects ARRAY[] literal case
    ✕ dataTypeCase option affects ARRAY type case (2 ms)
    ✕ supports array slice operator

@nene - I'll look into filling out the wiki 👍

@nene
Copy link
Collaborator

nene commented May 2, 2024

To fix ARRAY[] tests:

  • add ARRAY to dataTypes and remove it from keywords and functions lists.

To fix the bar[1:] test:

  • remove BAR from list of functions

I would guess the builtin BAR() function is rarely used. Well... at least I failed to find documentation of it, because it's super hard to google as Postgres docs contain loads of foo and bar in example code. I think it's better for the formatter to also support the more common use case of bar as a name used in example code.

PS. Make sure to run yarn pretty. (Looks like that currently also changes the .pre-commit-hooks.yaml file... you can let Prettier to reformat that, or leave it as-is. Either way is fine.)

@hughcameron
Copy link
Author

The test suite is passing completely now 🎉

I'll collate some notes for the wiki over the next week. Are there any other steps needed before merging?

@nene
Copy link
Collaborator

nene commented May 3, 2024

Thanks. I don't think there's anything else.

Copy link
Collaborator

@nene nene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay...

Took a bit of a look at this and noticed that several things (most notably all these operators) are not actually supported by DuckDB.

Also, when I simply compare it to PostgreSQL implementation, it looks almost the same (ignoring keywords and function names lists). But my very brief scanning of DuckDB documentation revealed several things that are different in DuckDB.

src/languages/duckdb/duckdb.formatter.ts Outdated Show resolved Hide resolved
src/languages/duckdb/duckdb.formatter.ts Outdated Show resolved Hide resolved
src/languages/duckdb/duckdb.formatter.ts Outdated Show resolved Hide resolved
'EXPLAIN',
'FETCH',
'GRANT',
'INSTALL',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like INSTALL is the only name added to this list compared to PostgreSQL. I would suspect there are more differences by the statements supported by PostgreSQL and DuckDB.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nene - you're correct. DuckDB uses the PostgreSQL parser. That's why DuckDB’s SQL dialect closely follows the conventions of the PostgreSQL dialect with only a few exceptions as listed here.

Many features are unsupported so I've removed unsupported elements in the formatter.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @nene - let me know if it's OK to resolve this conversation.

I've also added some specific DuckDB features below.

src/languages/duckdb/duckdb.functions.ts Show resolved Hide resolved
src/languages/duckdb/duckdb.keywords.ts Show resolved Hide resolved
@Zank94
Copy link

Zank94 commented Nov 20, 2024

Hello,
We're are working on a project using DuckDB and have a few issues with the standard SQL format (->> replaced with - > > for instance).
Do you know when this PR will be merged?
Thanks

@nene
Copy link
Collaborator

nene commented Nov 20, 2024

Thanks for the interest. This thing has indeed been sitting here for a while now. Will need to dig back into this to see if there's any reasons why it hasn't been merged yet. I think one of the main reasons was that it seemed really-really similar to PostgreSQL.

So a question to you @Zank94, if you just configure SQL Formatter to treat your SQL as PostgreSQL, would that solve your problems (e.g. the ->> operator is also present in PostgreSQL).

@Zank94
Copy link

Zank94 commented Nov 21, 2024

I see, thank you for the advice I will give it a try 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants