-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-130798: Add type hints to pathlib.types #131639
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about the covariant returns that are lacking but I don't remember if it's allowed to have a method with def f(self: T_co) -> T_co: ...
Thanks for the review!
Could you explain what you mean specifically? |
I should avoid reviewing while doing C++. Invariance is sufficient here, my bad. |
- pathsegments are `str`: joinpath, __truediv__, __rtruediv__, with_segments - symlink_to target is `str` - glob recurse_symlinks is `Literal[True]` - full_match and glob pattern is `str`
@barneygale @picnixz I made all changes as suggested I am now wondering if it wouldn't be better to just use |
That's a much better plan - please feel free. |
What do you chaps think about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For write_*
methods, one could return int or int / None if you want children classes to only return None if it's too hard for them to know how many characters were written. WDYT @barneygale?
Given that |
Lib/pathlib/types.py
Outdated
""" | ||
Return the path to which the symbolic link points. | ||
""" | ||
raise NotImplementedError | ||
|
||
def copy(self, target, **kwargs): | ||
def copy(self, target: _WP, **kwargs: Any) -> _WP: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use a ParamSpec
/similar here to show that the keyword arguments are passed to target._copy_from()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent quite a bit of time now trying to get this to work, but I believe the way _ReadablePath.copy
and _WritablePath._copy_from
are coupled is currently not representable in the Python type system.
Be aware, I am not an expert in this, so all conclusions I draw here should be confirmed with someone who is.
Ideally the typechecker would just be able to infer if keyword arguments are not compatible with the signature of a detected subclass of _WritablePath._copy_from
. But all my attempts failed so far. Basically the minimal scenario that needs to be supported is:
class R:
def copy[T: W](self, target: T, **kwargs) -> T:
target._copy_from(self, **kwargs)
return target
class W:
def _copy_from(self, source: R, /, *, follow_symlinks: Literal[True] = True) -> None:
return
-
In this example: "not typing the kwargs" (or typing them as Any) leads to missing incorrect keyword arguments: https://mypy-play.net/?mypy=latest&python=3.13&gist=7608e8dcd5e4b58dc2fc2c355bf6ed89
-
Trying to use TypedDicts to define the kwargs doesn't work either because the definitions would have to be provided to
_ReadablePath.copy
for each speficic target class: https://mypy-play.net/?mypy=latest&python=3.13&gist=12f4cbdd725073d8a152a90a941922cf -
ParamSpec looks pretty promising but to be able to actually bind the parameters, you need to go via a protocol class, which prevents you from reusing the actual target type as return type of the
_ReadablePath.copy
method. Moreover ParamSpec can't be used without binding P.args to the function signature: https://mypy-play.net/?mypy=latest&python=3.13&gist=b47ea834a0f57a722b8124f76f3fb6d4
Ideally we'd be able to do something like:
class R:
def copy[T: W, **P](self, target: T[P], **kwargs: P.kwargs) -> T[P]: ...
class W[**P]:
def _copy_from(self, source: R, **kwargs: P.kwargs) -> None: ...
But from reading through a bunch of docs and issues, I this seems impossible. I believe the correct related issue here should be this one python/typing#548 for support of higher-kinded typevars, and we'd need kwargs_only
support for ParamSpec: python/typing#1524
My guess is that by redesigning the coupling between _ReadablePath.copy
and _WritablePath._copy_from
it should be possible to better represent this in the type system. But it's getting late here, and I think I'll revisit this with a fresh mind tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intersection types might also work, since then you could do the following:
def copy[T: W, **P](self, target: T & SupportsCopyFrom[P], *args: P.args, **kwargs: P.kwargs) -> T: ...
This would mean that target
must be both a T
and SupportsCopyFrom
simultaneously. Intersections I think are being considered more seriously than HKT, but they're also another quite substantial addition to the type system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the analysis! Feels like we're pushing the current type system a bit far, so I'm happy with Any
. I could log an issue in pathlib-abc and link back to this thread.
I'm also happy to change how copy()
and _copy_from()
interact if there's something better we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have a solution, though it's awkward. First change is to pass *args
too, meaning ParamSpec
works. Second is to require _copy_to()
to redundantly return self
. Then we could do the following:
class _HasCopyTo[**Args, R](Protocol):
def _copy_to(
self, path: ReadablePath, /,
*args: Args.args, **kwargs: Args.kwargs) -> R: ...
class ReadablePath:
def copy[**Args, R](
self, path: _HasCopyTo[Args, R], /,
*args: Args.args, **kwargs: Args.kwargs,
) -> R:
res = path._copy_to(self, *args, **kwargs)
assert path is res, "Must return self"
return res
class WritablePath(ReadablePath):
@abstractmethod
def _copy_to(self, path: ReadablePath, /, *args: Any, **kwargs: Any) -> Self: ...
class Path(WritablePath):
def _copy_to(self, path: ReadablePath, /, some_option: bool) -> Self:
do_copy(self, path)
return self
The Any
in WritablePath
is required to allow subclasses to require any arguments, but it also (correctly?) means if you call copy()
with something that's just WritablePath
it'll accept anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That LGTM - hopefully it can be expressed in 3.9 syntax?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does! I'll make the change:
https://mypy-play.net/?mypy=latest&python=3.11&gist=8c300324c700eea4ffa7b7776a460f6e
code
from typing import Literal
from typing import Protocol
from typing import Generic
from typing import TypeVar
from typing import ParamSpec
from typing import Callable
from typing import Self
from typing import ClassVar
from typing import Any
R = TypeVar("R", bound="_WritablePath", covariant=True)
P = ParamSpec("P")
class _HasCopyFrom(Protocol[R, P]):
def _copy_from(self, source: _ReadablePath, /, *args: P.args, **kwargs: P.kwargs) -> R:
...
class _ReadablePath:
def copy(self, path: _HasCopyFrom[R, P], /, *args: P.args, **kwargs: P.kwargs) -> R:
return path._copy_from(self, *args, **kwargs)
class _WritablePath:
def _copy_from(self, path: _ReadablePath, /, *args: Any, **kwargs: Any) -> Self: ...
class MyPathR(_ReadablePath):
pass
class MyPathW0(_WritablePath):
def _copy_from(self, path: _ReadablePath, /, *, some_option: bool = True) -> Self:
return self
class MyPathW1(_WritablePath):
def _copy_from(self, path: _ReadablePath, /, *, other_option: int = 3) -> Self:
return self
r = MyPathR()
w0 = MyPathW0()
w1 = MyPathW1()
# correct
reveal_type(r.copy(w0))
r.copy(w0, some_option=False)
reveal_type(r.copy(w1))
r.copy(w1, other_option=4)
# error
r.copy(w0, dont_exist=2)
r.copy(w1, text="str")
r.copy(w0, some_option=1.0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I tried implementing the changes, and they don't fully work. While they solve the copy (*args
and) **kwargs
issue, they cause type checking issues in for example copy_into
because mypy can't infer the actual type of path, but only that path has the structural type _HasCopyFrom
. This is a limitation of how the protocol is expressed (see here: https://mypy-play.net/?mypy=latest&python=3.12&gist=c744090b69aad82d75c14e6ab7dedb4c ). So it seems intersection types would be required to do this correctly.
Also on a side note: I now wonder if follow_symlinks
should be an explicit keyword argument in copy
and copy_into
since it's mandatory in _WritablePath._copy_from
's default implementation.
Additionally, forcing _copy_from
to return Self
would interfere with #131636
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but the problem imo is that both joinpath
and __truediv__
should actually return the same type. The workaround is nice, but to make it work, one needs to be typed as returning R
because it has to return the WritablePath instance and the other as the HasCopyFrom
protocol because it's passed again into .copy ...
So in the end you have to adjust the protocol to the exact implementation of copy and copy_into which is unfortunate. If both in copy and copy_into joinpath would be used once as the return type and once to pass as the target arg to copy, the typing workaround would not work.
I'm not sure if the benefit of being able to type check the copy kwargs justifies the introduction of this trick.
Let's keep the |
Just noticed the discussion in #130798: I'm happy to recreate this PR in typeshed if needed. |
Hello everyone,
This PR addresses #130798
As requested, the type hints are kept compatible with Python 3.9 to simplify release inpathlib-abc
.I also just noticed #131621 which I assumed will be merged soon, because it simplifies typing of
_ReadablePath.copy
and_ReadablePath.copy_into
. Otherwise, forstr
targets the_ReadablePath
subclass has to be a_WritablePath
subclass too to make sense as a target.Since, I haven't contributed type hinted code to cpython so far, it would be great if you could let me know if this PR should also ship tests for verifying the types somewhere.
Cheers,
Andreas 😃