bpo-42064: Migrate to `sqlite3_create_collation_v2` by erlend-aasland · Pull Request #27156 · python/cpython

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

@erlend-aasland

As a nice side effect, we can now remove the collation dict from the
connection object, since SQLite now takes care of destroying the
callback context.

This change is needed in order to make it more convenient to add a
module state to the sqlite3 module.

https://bugs.python.org/issue42064

This implies that SQLite now takes care of destroying the callback
context (the PyObject callable it has been passed), so we can strip the
collation dict from the connection object.

@erlend-aasland

@erlend-aasland

FYI, the test suite already exercise clearing a collation callback and re-registering a collation callback.

@erlend-aasland

BTW, I'd like to change the finally label at the end of pysqlite_connection_init_impl to error, for consistency with the rest of the code base.

encukou

Copy link

Member

@encukou encukou left a comment

Looks good so far!

I see that the beginning of the function does some busywork converting name to uppercase and checking for allowed characters. Is that limitation there only to make that sure comparison of keys in the collations dict match sqlite3_strnicmp? If so, PyUnicode_AsUTF8(name) could be used instead.

As for the finallyerror rename, I'd say finally is a better name when it's also executed in the non-error case...

@erlend-aasland

I see that the beginning of the function does some busywork converting name to uppercase and checking for allowed characters. Is that limitation there only to make that sure comparison of keys in the collations dict match sqlite3_strnicmp? If so, PyUnicode_AsUTF8(name) could be used instead.

Yes, I noticed that part as well. I haven't investigated why this was added in the first place. I'll try with PyUnicode_AsUTF8(). Thanks!

As for the finallyerror rename, I'd say finally is a better name when it's also executed in the non-error case...

True.

@erlend-aasland

FYI, the collation code (including the uppercase conversion / check), was added to pysqlite in 2006, when it still was an external project. The code seems to have been borrowed directly from apsw.

UPDATE:
SQLite accepts UTF8 collation names; there should be no reason to limit collation names to ASCII. Maybe there was a bug or a limitation in earlier SQLite versions. The apsw code has removed this limitation and now accepts UTF8 names. We can do that as well. I'll open up an issue for that.

UPDATE AGAIN:
I've created issue 44688. I'll open a PR once this is merged. PR #27395 opened.

@encukou

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue

Jul 27, 2021

JuniorJPDJ added a commit to JuniorJPDJ/cpython that referenced this issue

Aug 12, 2021
This implies that SQLite now takes care of destroying the callback
context (the PyObject callable it has been passed), so we can strip the
collation dict from the connection object.