Skip to content

Commit

Permalink
PEP 765: add report as appendix (python#4125)
Browse files Browse the repository at this point in the history
  • Loading branch information
iritkatriel authored Nov 20, 2024
1 parent 798ee6e commit 6083e27
Showing 1 changed file with 165 additions and 6 deletions.
171 changes: 165 additions & 6 deletions peps/pep-0765.rst
Original file line number Diff line number Diff line change
Expand Up @@ -54,8 +54,8 @@ flag them as a problem.
Rationale
=========

A recent `analysis of real world code
<https://github.com/iritkatriel/finally/blob/main/README.md>`_ shows that:
A recent
`analysis of real world code <https://github.com/iritkatriel/finally/blob/main/README.md>`__ shows that:

- These features are rare (2 per million LOC in the top 8,000 PyPI
packages, 4 per million LOC in a random selection of packages).
Expand All @@ -65,11 +65,13 @@ A recent `analysis of real world code
- Code owners are typically receptive to fixing the bugs, and
find that easy to do.

See `the appendix <#appendix>`__ for more details.

This new data indicates that it would benefit Python's users if
Python itself moved them away from this harmful feature.

One of the arguments brought up in `the PEP 601 discussion
<https://discuss.python.org/t/pep-601-forbid-return-break-continue-breaking-out-of-finally/2239/24>`__
One of the arguments brought up in
`the PEP 601 discussion <https://discuss.python.org/t/pep-601-forbid-return-break-continue-breaking-out-of-finally/2239/24>`__
was that language features should be orthogonal, and combine without
context-based restrictions. However, in the meantime :pep:`654` has
been implemented, and it forbids ``return``, ``break`` and ``continue``
Expand Down Expand Up @@ -150,11 +152,168 @@ How to Teach This

The change will be documented in the language spec and in the
What's New documentation. The ``SyntaxWarning`` will alert users
that their code needs to change. The `empirical evidence
<https://github.com/iritkatriel/finally/blob/main/README.md>`__
that their code needs to change. The `empirical evidence <#appendix>`__
shows that the changes necessary are typically quite
straightforward.

Appendix
========

``return`` in ``finally`` considered harmful
--------------------------------------------

Below is an abridged version of a
`research report <https://github.com/iritkatriel/finally/commits/main/README.md>`__
by Irit Katriel, which was posted on 9 Nov 2024.
It describes an investigation into usage of ``return``, ``break`` and ``continue``
in a ``finally`` clause in real world code, addressing the
questions: Are people using it? How often are they using it incorrectly?
How much churn would the proposed change create?

Method
^^^^^^

The analysis is based on the 8,000 most popular PyPI packages, in terms of number
of downloads in the last 30 days. They were downloaded on the 17th-18th of
October, using
`a script <https://github.com/faster-cpython/tools/blob/main/scripts/download_packages.py>`__
written by Guido van Rossum, which in turn relies on Hugo van Kemenade's
`tool <https://hugovk.github.io/top-pypi-packages/>`__ that creates a list of the
most popular packages.

Once downloaded, a
`second script <https://github.com/iritkatriel/finally/blob/main/scripts/ast_analysis.py>`__
was used to construct an AST for each file, and traverse it to identify ``break``,
``continue`` and ``return`` statements which are directly inside a ``finally`` block.

I then found the current source code for each occurrence, and categorized it. For
cases where the code seems incorrect, I created an issue in the project's bug
tracker. The responses to these issues are also part of the data collected in
this investigation.

Results
^^^^^^^

I decided not to include a list of the incorrect usages, out of concern that
it would make this report look like a shaming exercise. Instead I will describe
the results in general terms, but will mention that some of the problems I found
appear in very popular libraries, including a cloud security application.
For those so inclined, it should not be hard to replicate my analysis, as I
provided links to the scripts I used in the Method section.

The projects examined contained a total of 120,964,221 lines of Python code,
and among them the script found 203 instances of control flow instructions in a
``finally`` block. Most were ``return``, a handful were ``break``, and none were
``continue``. Of these:

- 46 are correct, and appear in tests that target this pattern as a feature (e.g.,
tests for linters that detect it).
- 8 seem like they could be correct - either intentionally swallowing exceptions
or appearing where an active exception cannot occur. Despite being correct, it is
not hard to rewrite them to avoid the bad pattern, and it would make the code
clearer: deliberately swallowing exceptions can be more explicitly done with
``except BaseException:``, and ``return`` which doesn't swallow exceptions can be
moved after the ``finally`` block.
- 149 were clearly incorrect, and can lead to unintended swallowing of exceptions.
These are analyzed in the next section.

**The Error Cases**

Many of the error cases followed this pattern:

.. code-block::
:class: bad
try:
...
except SomeSpecificError:
...
except Exception:
logger.log(...)
finally:
return some_value
Code like this is obviously incorrect because it deliberately logs and swallows
``Exception`` subclasses, while silently swallowing ``BaseExceptions``. The intention
here is either to allow ``BaseExceptions`` to propagate on, or (if the author is
unaware of the ``BaseException`` issue), to log and swallow all exceptions. However,
even if the ``except Exception`` was changed to ``except BaseException``, this code
would still have the problem that the ``finally`` block swallows all exceptions
raised from within the ``except`` block, and this is probably not the intention
(if it is, that can be made explicit with another ``try``-``except BaseException``).

Another variation on the issue found in real code looks like this:

.. code-block::
:class: bad
try:
...
except:
return NotImplemented
finally:
return some_value
Here the intention seems to be to return ``NotImplemented`` when an exception is
raised, but the ``return`` in the ``finally`` block would override the one in the
``except`` block.

.. note:: Following the
`discussion <https://discuss.python.org/t/an-analysis-of-return-in-finally-in-the-wild/70633/15>`__,
I repeated the analysis on a random selection of PyPI packages (to
analyze code written by *average* programmers). The sample contained
in total 77,398,892 lines of code with 316 instances of ``return``/``break``/``continue``
in ``finally``. So about 4 instances per million lines of code.

**Author reactions**

Of the 149 incorrect instances of ``return`` or ``break`` in a ``finally`` clause,
27 were out of date, in the sense that they do not appear in the main/master branch
of the library, as the code has been deleted or fixed by now. The remaining 122
are in 73 different packages, and I created an issue in each one to alert the
authors to the problems. Within two weeks, 40 of the 73 issues received a reaction
from the code maintainers:

- 15 issues had a PR opened to fix the problem.
- 20 received reactions acknowledging the problem as one worth looking into.
- 3 replied that the code is no longer maintained so this won't be fixed.
- 2 closed the issue as "works as intended", one said that they intend to
swallow all exceptions, but the other seemed unaware of the distinction
between ``Exception`` and ``BaseException``.

One issue was linked to a pre-existing open issue about non-responsiveness to Ctrl-C,
conjecturing a connection.

Two of the issue were labelled as "good first issue".

**The correct usages**

The 8 cases where the feature appears to be used correctly (in non-test code) also
deserve attention. These represent the "churn" that would be caused by blocking
the feature, because this is where working code will need to change. I did not
contact the authors in these cases, so we need to assess the difficulty of
making these changes ourselves. It is shown in
`the full report <https://github.com/iritkatriel/finally/commits/main/README.md>`__,
that the change required in each case is small.

Discussion
^^^^^^^^^^

The first thing to note is that ``return``/``break``/``continue`` in a ``finally``
block is not something we see often: 203 instance in over 120 million lines
of code. This is, possibly, thanks to the linters that warn about this.

The second observation is that most of the usages were incorrect: 73% in our
sample (149 of 203).

Finally, the author responses were overwhelmingly positive. Of the 40 responses
received within two weeks, 35 acknowledged the issue, 15 of which also created
a PR to fix it. Only two thought that the code is fine as it is, and three
stated that the code is no longer maintained so they will not look into it.

The 8 instances where the code seems to work as intended, are not hard to
rewrite.

Copyright
=========

Expand Down

0 comments on commit 6083e27

Please sign in to comment.