Mailing List Archive

Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
Say there, the Python core development community! Have I got
a question for you!

*ahem*

Which of the following four options do you dislike least? ;-)

1) CPython continues to provide no "function signature"
objects (PEP 362) or inspect.getfullargspec() information
for any function implemented in C.

2) We add new hand-coded data structures representing the
metadata necessary for function signatures for builtins.
Which means that, when defining arguments to functions in C,
we'd need to repeat ourselves *even more* than we already do.

3) Builtin function arguments are defined using some seriously
uncomfortable and impenetrable C preprocessor macros, which
produce all the various types of output we need (argument
processing code, function signature metadata, possibly
the docstrings too).

4) Builtin function arguments are defined in a small DSL; these
are expanded to code and data using a custom compile-time
preprocessor step.


All the core devs I've asked said "given all that, I'd prefer the
hairy preprocessor macros". But by the end of the conversation
they'd changed their minds to prefer the custom DSL. Maybe I'll
make a believer out of you too--read on!


I've named this DSL preprocessor "Argument Clinic", or Clinic
for short**. Clinic works similarly to Ned Batchelder's brilliant
"Cog" tool:
http://nedbatchelder.com/code/cog/

You embed the input to Clinic in a comment in your C file,
and the output is written out immediately after that comment.
The output's overwritten every time the preprocessor is run.
In short it looks something like this:

/*[clinic]
input to the DSL
[clinic]*/

... output from the DSL, overwritten every time ...

/*[clinic end:<checksum>]*/
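
A rough Python sketch of the in-place rewriting described above. This is not the prototype's code: `regenerate` and `render` are hypothetical names, and using SHA-1 for the checksum is an assumption (the sample checksum later in this mail happens to be 40 hex digits).

```python
import hashlib
import re

# One Clinic block: the input comment, the previous output, and the end marker.
BLOCK = re.compile(
    r"/\*\[clinic\]\n(?P<input>.*?)\[clinic\]\*/\n"
    r"(?P<output>.*?)"
    r"/\*\[clinic end:(?P<checksum>[^\]]*)\]\*/",
    re.DOTALL,
)


def regenerate(source, render):
    """Rewrite the output section of every Clinic block in `source`.

    `render` is a hypothetical callback that maps DSL input text to output text.
    """
    def replace(match):
        output = render(match.group("input"))
        if not output.endswith("\n"):
            output += "\n"
        # Assumption: the checksum covers exactly the generated output.
        checksum = hashlib.sha1(output.encode("utf-8")).hexdigest()
        return (
            "/*[clinic]\n" + match.group("input") + "[clinic]*/\n"
            + output
            + "/*[clinic end:" + checksum + "]*/"
        )

    return BLOCK.sub(replace, source)
```

Running `regenerate` twice with the same `render` leaves the file unchanged, which is the property that makes checking the output into version control tolerable.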

The input to the DSL includes all the metadata about the
function that we need for the function signature:

* the name of the function,
* the return annotation (if any),
* each parameter to the function, including
* its name,
* its type (in C),
* its default value,
* and a per-parameter docstring;
* and the docstring for the function as a whole.

The resulting output contains:

* the docstring for the function,
* declarations for all your parameters,
* C code handling all argument processing for you,
* and a #define'd methoddef structure for adding the
function to the module.
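
To make the two lists above concrete, here is a small Python sketch of the metadata and of one derived output (the first docstring line). The class and function names are made up for illustration; the prototype may model this differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Param:
    name: str
    c_type: str
    default: Optional[str] = None  # Python-level default, kept as source text
    doc: str = ""


@dataclass
class Function:
    name: str
    returns: Optional[str] = None
    params: List[Param] = field(default_factory=list)
    doc: str = ""


def signature_line(fn):
    """Render the first docstring line, nesting optional parameters in brackets.

    Assumes required parameters come before optional ones.
    """
    required = [p.name for p in fn.params if p.default is None]
    optional = [p for p in fn.params if p.default is not None]
    text = ", ".join(required)
    tail = ""
    for p in reversed(optional):
        # Only the very first parameter of the whole list has no leading comma.
        sep = ", " if (text or p is not optional[0]) else ""
        tail = "[" + sep + p.name + "=" + p.default + tail + "]"
    line = fn.name + "(" + text + tail + ")"
    if fn.returns:
        line += " -> " + fn.returns
    return line
```

Fed the dbm.open() metadata from the preview further down, `signature_line` yields the same first line as the generated docstring there.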


I discussed this with Mark "HotPy" Shannon, and he suggested we break
our existing C functions into two. We put the argument processing
into its own function, generated entirely by Clinic, and have the
implementation in a second function called from the first. I like
this approach simply because it makes the code cleaner. (Note that
this approach should not cause any overhead with a modern compiler,
as both functions will be "static".)

But it also provides an optimization opportunity for HotPy: it could
read the metadata, and when generating the JIT'd code it could skip
building the PyObjects and argument tuple (and possibly keyword
argument dict), and the subsequent unpacking/decoding, and just call
the implementation function directly, giving it a likely-measurable
speed boost.
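
The split can be illustrated with a Python stand-in for the C pair. This is only an analogy under stated assumptions: the real functions are C, and `dbmopen_impl` here just echoes its arguments.

```python
def dbmopen_impl(filename, flags, mode):
    """Implementation half: receives already-decoded arguments."""
    return (filename, flags, mode)  # stand-in for the real work


def dbmopen(*args, **kwargs):
    """Generated-style half: only unpacks arguments, then delegates."""
    names = ("filename", "flags", "mode")
    values = {"flags": "r", "mode": 0o666}  # defaults from the declaration
    if len(args) > len(names):
        raise TypeError("too many positional arguments")
    for name, value in zip(names, args):
        values[name] = value
    for name, value in kwargs.items():
        if name not in names:
            raise TypeError("unexpected keyword argument %r" % name)
        if name in names[:len(args)]:
            raise TypeError("multiple values for argument %r" % name)
        values[name] = value
    if "filename" not in values:
        raise TypeError("missing required argument 'filename'")
    return dbmopen_impl(values["filename"], values["flags"], values["mode"])
```

A caller that already holds decoded values (the JIT case above) can call `dbmopen_impl("x.db", "r", 0o666)` directly and skip the unpacking in `dbmopen` entirely.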

And we can go further! If we add a new extension type API allowing
you to register both functions, and external modules start using it,
sophisticated Python implementations like PyPy might be able to skip
building the tuple for extension type function calls--speeding those
up too!

Another plausible benefit: alternate implementations of Python could
read the metadata--or parse the input to Clinic themselves--to ensure
their reimplementations of the Python standard library conform to the
same API!


Clinic can also run general-purpose Python code ("/*[python]").
All output from "print" is redirected into the output section
after the Python code.
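
One plausible way to capture "print" output from such a block, sketched with the standard library. `run_python_block` is a hypothetical name; the prototype may do this differently.

```python
import contextlib
import io


def run_python_block(code, namespace=None):
    """Execute `code` and return everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace if namespace is not None else {})
    return buffer.getvalue()
```

The returned text is what would be written into the output section that follows the block.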


As you've no doubt already guessed, I've made a prototype of
Argument Clinic. You can see it--and some sample conversions of
builtins using it for argument processing--at this BitBucket repo:

https://bitbucket.org/larry/python-clinic

I don't claim that it's fabulous, production-ready code. But it's
a definite start!


To save you a little time, here's a preview of using Clinic for
dbm.open(). The lines at the same indent as a declaration are
options; see "clinic.txt" in the repo above for full documentation.

/*[clinic]
dbm.open -> mapping
    basename=dbmopen

    const char *filename;
        The filename to open.

    const char *flags="r";
        How to open the file. "r" for reading, "w" for writing, etc.

    int mode=0666;
        default=0o666
        If creating a new file, the mode bits for the new file
        (e.g. os.O_RDWR).

Returns a database object.

[clinic]*/

PyDoc_STRVAR(dbmopen__doc__,
"dbm.open(filename[, flags=\'r\'[, mode=0o666]]) -> mapping\n"
"\n"
" filename\n"
" The filename to open.\n"
"\n"
" flags\n"
" How to open the file. \"r\" for reading, \"w\" for writing, etc.\n"
"\n"
" mode\n"
" If creating a new file, the mode bits for the new file\n"
" (e.g. os.O_RDWR).\n"
"\n"
"Returns a database object.\n"
"\n");

#define DBMOPEN_METHODDEF \
    {"open", (PyCFunction)dbmopen, METH_VARARGS | METH_KEYWORDS, dbmopen__doc__}

static PyObject *
dbmopen_impl(PyObject *self, const char *filename, const char *flags,
             int mode);

static PyObject *
dbmopen(PyObject *self, PyObject *args, PyObject *kwargs)
{
    const char *filename;
    const char *flags = "r";
    int mode = 0666;
    static char *_keywords[] = {"filename", "flags", "mode", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, kwargs,
        "s|si", _keywords,
        &filename, &flags, &mode))
        return NULL;

    return dbmopen_impl(self, filename, flags, mode);
}

static PyObject *
dbmopen_impl(PyObject *self, const char *filename, const char *flags,
             int mode)
/*[clinic end:eddc886e542945d959b44b483258bf038acf8872]*/
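
The "s|si" format string in the generated code follows mechanically from the declarations. A minimal sketch of that derivation, assuming a hypothetical lookup table that covers only the two C types used in this example:

```python
# Hypothetical table: C declaration type -> PyArg_Parse* format unit.
FORMAT_UNITS = {
    "const char *": "s",
    "int": "i",
}


def format_string(params):
    """Build a format string from (c_type, has_default) pairs."""
    units = []
    seen_optional = False
    for c_type, has_default in params:
        if has_default and not seen_optional:
            units.append("|")  # everything after '|' is optional
            seen_optional = True
        elif seen_optional and not has_default:
            raise ValueError("required parameter after an optional one")
        units.append(FORMAT_UNITS[c_type])
    return "".join(units)
```

For filename (required), flags and mode (both defaulted) this produces "s|si", matching the call to PyArg_ParseTupleAndKeywords above.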


As of this writing, I also have sample conversions in the following files
available for your perusal:
Modules/_cursesmodule.c
Modules/_dbmmodule.c
Modules/posixmodule.c
Modules/zlibmodule.c
Just search in C files for '[clinic]' and you'll find everything soon
enough.

As you can see, Clinic has already survived some contact with the
enemy. I've already converted some tricky functions--for example,
os.stat() and curses.window.addch(). The latter required adding a
new positional-only processing mode for functions using a legacy
argument processing approach. (See "clinic.txt" for more.) If you
can suggest additional tricky functions to support, please do!


Big unresolved questions:

* How would we convert all the builtins to use Clinic? I fear any
solution will involve some work by hand. Even if we can automate
big chunks of it, fully automating it would require parsing arbitrary
C. This seems like overkill for a one-shot conversion.
(Mark Shannon says he has some ideas.)

* How do we create the Signature objects? My current favorite idea:
Clinic also generates a new, TBD C structure defining all the
information necessary for the signature, which is also passed in to
the new registration API (you remember, the one that takes both the
argument-processing function and the implementation function). This
is secreted away in some new part of the C function object. At
runtime this is converted on-demand into a Signature object. Default
values for arguments are represented in C as strings; the conversion
process attempts eval() on the string, and if that works it uses the
result, otherwise it simply passes through the string.

* Right now Clinic paves over the PyArg_ParseTuple API for you.
If we convert CPython to use Clinic everywhere, theoretically we
could replace the parsing API with something cleaner and/or faster.
Does anyone have good ideas (and time, and energy) here?

* There's actually a fifth option, proposed by Brett Cannon. We
constrain the format of docstrings for builtin functions to make
them machine-readable, then generate the function signature objects
from that. But consider: generating *everything* in the signature
object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and
this might gunk up the docstring.
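
The on-demand conversion in the second bullet could look roughly like this in Python. It is a sketch only: the "TBD C structure" is modeled as a list of (name, default string) pairs, and the function names are invented here.

```python
import inspect


def convert_default(text):
    """Try to evaluate a default stored as a string; fall back to the string."""
    try:
        # Empty builtins keep the evaluation to simple expressions.
        # This is not a security sandbox.
        return eval(text, {"__builtins__": {}})
    except Exception:
        return text


def build_signature(params):
    """`params` is a list of (name, default_text_or_None) pairs."""
    parameters = []
    for name, default_text in params:
        default = (inspect.Parameter.empty if default_text is None
                   else convert_default(default_text))
        parameters.append(inspect.Parameter(
            name, inspect.Parameter.POSITIONAL_OR_KEYWORD, default=default))
    return inspect.Signature(parameters)
```

For dbm.open(), `build_signature([("filename", None), ("flags", "'r'"), ("mode", "0o666")])` gives a Signature whose string form is `(filename, flags='r', mode=438)`. A function like curses.window.addch() would need Parameter.POSITIONAL_ONLY instead, which this sketch does not model.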


But the biggest unresolved question... is this all actually a terrible
idea?


//arry/


** "Is this the right room for an argument?"
"I've told you once...!"
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/list-python-dev%40lists.gossamer-threads.com
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Dec 03, 2012, at 02:29 PM, Larry Hastings wrote:

>4) Builtin function arguments are defined in a small DSL; these
> are expanded to code and data using a custom compile-time
> preprocessor step.
>
>All the core devs I've asked said "given all that, I'd prefer the
>hairy preprocessor macros". But by the end of the conversation
>they'd changed their minds to prefer the custom DSL. Maybe I'll
>make a believer out of you too--read on!

The biggest question with generated code is always the effect on debugging.
How horrible will it be when I have to step through argument parsing to figure
out what's going wrong?

-Barry
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Mon, Dec 3, 2012 at 2:29 PM, Larry Hastings <larry@hastings.org> wrote:

>
> Say there, the Python core development community! Have I got
> a question for you!
>
> *ahem*
>
> Which of the following four options do you dislike least? ;-)
>
> 1) CPython continues to provide no "function signature"
> objects (PEP 362) or inspect.getfullargspec() information
> for any function implemented in C.
>
>
yuck on #1, though this is what happens by default if we don't do anything
nice.


> 2) We add new hand-coded data structures representing the
> metadata necessary for function signatures for builtins.
> Which means that, when defining arguments to functions in C,
> we'd need to repeat ourselves *even more* than we already do.
>
>
yuck on #2.


> 3) Builtin function arguments are defined using some seriously
> uncomfortable and impenetrable C preprocessor macros, which
> produce all the various types of output we need (argument
> processing code, function signature metadata, possibly
> the docstrings too).
>

Likely painful to maintain. C++ templates would likely be easier.


>
> 4) Builtin function arguments are defined in a small DSL; these
> are expanded to code and data using a custom compile-time
> preprocessor step.


>
> All the core devs I've asked said "given all that, I'd prefer the
> hairy preprocessor macros". But by the end of the conversation
> they'd changed their minds to prefer the custom DSL. Maybe I'll
> make a believer out of you too--read on!
>

It always strikes me that C++ could be such a DSL that could likely be used
for this purpose rather than defining and maintaining our own "yet another
C preprocessor" step. But I don't have suggestions and we're not allowing
C++ so... nevermind. :)


> I've named this DSL preprocessor "Argument Clinic", or Clinic
> for short**. Clinic works similarly to Ned Batchelder's brilliant
> "Cog" tool:
> http://nedbatchelder.com/code/cog/
>
> You embed the input to Clinic in a comment in your C file,
> and the output is written out immediately after that comment.
> The output's overwritten every time the preprocessor is run.
> In short it looks something like this:
>
> /*[clinic]
> input to the DSL
> [clinic]*/
>
> ... output from the DSL, overwritten every time ...
>
> /*[clinic end:<checksum>]*/
>
> The input to the DSL includes all the metadata about the
> function that we need for the function signature:
>
> * the name of the function,
> * the return annotation (if any),
> * each parameter to the function, including
> * its name,
> * its type (in C),
> * its default value,
> * and a per-parameter docstring;
> * and the docstring for the function as a whole.
>
> The resulting output contains:
>
> * the docstring for the function,
> * declarations for all your parameters,
> * C code handling all argument processing for you,
> * and a #define'd methoddef structure for adding the
> function to the module.
>
>
> I discussed this with Mark "HotPy" Shannon, and he suggested we break
> our existing C functions into two. We put the argument processing
> into its own function, generated entirely by Clinic, and have the
> implementation in a second function called from the first. I like
> this approach simply because it makes the code cleaner. (Note that
> this approach should not cause any overhead with a modern compiler,
> as both functions will be "static".)
>
> But it also provides an optimization opportunity for HotPy: it could
> read the metadata, and when generating the JIT'd code it could skip
> building the PyObjects and argument tuple (and possibly keyword
> argument dict), and the subsequent unpacking/decoding, and just call
> the implementation function directly, giving it a likely-measurable
> speed boost.
>
> And we can go further! If we add a new extension type API allowing
> you to register both functions, and external modules start using it,
> sophisticated Python implementations like PyPy might be able to skip
> building the tuple for extension type function calls--speeding those
> up too!
>
> Another plausible benefit: alternate implementations of Python could
> read the metadata--or parse the input to Clinic themselves--to ensure
> their reimplementations of the Python standard library conform to the
> same API!
>
>
> Clinic can also run general-purpose Python code ("/*[python]").
> All output from "print" is redirected into the output section
> after the Python code.
>
>
> As you've no doubt already guessed, I've made a prototype of
> Argument Clinic. You can see it--and some sample conversions of
> builtins using it for argument processing--at this BitBucket repo:
>
> https://bitbucket.org/larry/python-clinic
>
> I don't claim that it's fabulous, production-ready code. But it's
> a definite start!
>
>
> To save you a little time, here's a preview of using Clinic for
> dbm.open(). The stuff at the same indent as a declaration are
> options; see the "clinic.txt" in the repo above for full documentation.
>
> /*[clinic]
> dbm.open -> mapping
> basename=dbmopen
>
> const char *filename;
> The filename to open.
>
> const char *flags="r";
> How to open the file. "r" for reading, "w" for writing, etc.
>
> int mode=0666;
> default=0o666
> If creating a new file, the mode bits for the new file
> (e.g. os.O_RDWR).
>
> Returns a database object.
>
> [clinic]*/
>
> PyDoc_STRVAR(dbmopen__doc__,
> "dbm.open(filename[, flags=\'r\'[, mode=0o666]]) -> mapping\n"
> "\n"
> " filename\n"
> " The filename to open.\n"
> "\n"
> " flags\n"
> " How to open the file. \"r\" for reading, \"w\" for writing, etc.\n"
> "\n"
> " mode\n"
> " If creating a new file, the mode bits for the new file\n"
> " (e.g. os.O_RDWR).\n"
> "\n"
> "Returns a database object.\n"
> "\n");
>
> #define DBMOPEN_METHODDEF \
> {"open", (PyCFunction)dbmopen, METH_VARARGS | METH_KEYWORDS, dbmopen__doc__}
>
> static PyObject *
> dbmopen_impl(PyObject *self, const char *filename, const char *flags,
> int mode);
>
> static PyObject *
> dbmopen(PyObject *self, PyObject *args, PyObject *kwargs)
> {
> const char *filename;
> const char *flags = "r";
> int mode = 0666;
> static char *_keywords[] = {"filename", "flags", "mode", NULL};
>
> if (!PyArg_ParseTupleAndKeywords(args, kwargs,
> "s|si", _keywords,
> &filename, &flags, &mode))
> return NULL;
>
> return dbmopen_impl(self, filename, flags, mode);
> }
>
> static PyObject *
> dbmopen_impl(PyObject *self, const char *filename, const char *flags,
> int mode)
> /*[clinic end:eddc886e542945d959b44b483258bf038acf8872]*/
>
>
> As of this writing, I also have sample conversions in the following files
> available for your perusal:
> Modules/_cursesmodule.c
> Modules/_dbmmodule.c
> Modules/posixmodule.c
> Modules/zlibmodule.c
> Just search in C files for '[clinic]' and you'll find everything soon
> enough.
>
> As you can see, Clinic has already survived some contact with the
> enemy. I've already converted some tricky functions--for example,
> os.stat() and curses.window.addch(). The latter required adding a
> new positional-only processing mode for functions using a legacy
> argument processing approach. (See "clinic.txt" for more.) If you
> can suggest additional tricky functions to support, please do!
>
>
> Big unresolved questions:
>
> * How would we convert all the builtins to use Clinic? I fear any
> solution will involve some work by hand. Even if we can automate
> big chunks of it, fully automating it would require parsing arbitrary
> C. This seems like overkill for a one-shot conversion.
> (Mark Shannon says he has some ideas.)
>

A lot of hand work. Sprints at PyCon, etc. Automating nice chunks of it
could be partially done for some easy cases, such as things that only use
ParseTuple today.


>
> * How do we create the Signature objects? My current favorite idea:
> Clinic also generates a new, TBD C structure defining all the
> information necessary for the signature, which is also passed in to
> the new registration API (you remember, the one that takes both the
> argument-processing function and the implementation function). This
> is secreted away in some new part of the C function object. At
> runtime this is converted on-demand into a Signature object. Default
> values for arguments are represented in C as strings; the conversion
> process attempts eval() on the string, and if that works it uses the
> result, otherwise it simply passes through the string.
>

I think passing on the string if that doesn't work is wrong. It could lead
to a behavior change not realized until runtime due to some other possibly
unrelated thing causing the eval to fail. A failure to eval() one of these
strings should result in an ImportError from the extension module's init or
a fatal failure if it is a builtin. (I'm assuming these would be done at
extension module import time at or after the end of the module init
function)


>
> * Right now Clinic paves over the PyArg_ParseTuple API for you.
> If we convert CPython to use Clinic everywhere, theoretically we
> could replace the parsing API with something cleaner and/or faster.
> Does anyone have good ideas (and time, and energy) here?
>

By "paves over" do you mean that Clinic is currently using the ParseTuple
API in its generated code? Yes, we should do better. But don't hold Clinic
up on that. In fact, allowing a version of Clinic to work standalone as a
PyPI project and generate Python 2.7 and 3.2/3.3 extension module
boilerplate code would increase its adoption and improve the quality of
some existing extension modules that choose to use it.

My first take on this would be to do the obvious and expand the code within
the case/switch statement in the loop that ParseTuple ends up in directly
so that we're just generating raw parameter validation and acceptance code
based on the clinic definition. I've never liked things in C that parse a
string at runtime to determine behavior. (please don't misinterpret that
to suggest I don't like Python ;)
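
The difference between the two approaches can be shown with a Python analogy. It is not C and not Clinic's output; all names are hypothetical, and both variants skip the required-argument count check for brevity.

```python
def as_str(value):
    if not isinstance(value, str):
        raise TypeError("expected str, got " + type(value).__name__)
    return value


def as_int(value):
    if not isinstance(value, int):
        raise TypeError("expected int, got " + type(value).__name__)
    return value


HELPERS = {"s": "as_str", "i": "as_int"}


def parse_with_format(fmt, args):
    """Runtime approach: walk the format string on every call."""
    converters = {"s": as_str, "i": as_int}
    return [converters[unit](value)
            for unit, value in zip(fmt.replace("|", ""), args)]


def generate_parser_source(name, fmt):
    """Generation-time approach: emit straight-line code, one step per parameter."""
    lines = ["def %s(args):" % name, "    out = []"]
    for index, unit in enumerate(fmt.replace("|", "")):
        lines.append("    if len(args) > %d:" % index)
        lines.append("        out.append(%s(args[%d]))" % (HELPERS[unit], index))
    lines.append("    return out")
    return "\n".join(lines) + "\n"
```

The generated source contains no format string at all; the interpretation happened once, when the code was generated, rather than on every call.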


> * There's actually a fifth option, proposed by Brett Cannon. We
> constrain the format of docstrings for builtin functions to make
> them machine-readable, then generate the function signature objects
> from that. But consider: generating *everything* in the signature
> object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and
> this might gunk up the docstring.
>
>
> But the biggest unresolved question... is this all actually a terrible
> idea?
>

No it is not. I like it.

I don't _like_ adding another C preprocessor but I think if we keep it very
limited it is a perfectly reasonable thing to do as part of our build
process.


>
> //arry/
>
>
> ** "Is this the right room for an argument?"
> "I've told you once...!"
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
+1 to what Greg said.

--
Sent from my phone, thus the relative brevity :)
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 12/3/2012 3:42 PM, Gregory P. Smith wrote:
>
> All the core devs I've asked said "given all that, I'd prefer the
> hairy preprocessor macros". But by the end of the conversation
> they'd changed their minds to prefer the custom DSL. Maybe I'll
> make a believer out of you too--read on!
>
>
> It always strikes me that C++ could be such a DSL that could likely be
> used for this purpose rather than defining and maintaining our own
> "yet another C preprocessor" step. But I don't have suggestions and
> we're not allowing C++ so... nevermind. :)

C++ has enough power to delude many (including me) into thinking that it
could be used this way.... but in my experience, it isn't quite there.
There isn't quite enough distinction between various integral types to
achieve the goals I once had, anyway... and that was some 15 years
ago... but for compatibility reasons, I doubt it has improved in that area.

Glenn
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 12/03/2012 03:42 PM, Gregory P. Smith wrote:
> On Mon, Dec 3, 2012 at 2:29 PM, Larry Hastings <larry@hastings.org> wrote:
>
> Default
> values for arguments are represented in C as strings; the conversion
> process attempts eval() on the string, and if that works it uses the
> result, otherwise it simply passes through the string.
>
>
> I think passing on the string if that doesn't work is wrong. It could
> lead to a behavior change not realized until runtime due to some other
> possibly unrelated thing causing the eval to fail.

Good point. I amend my proposal to say: we make this explicit rather
than implicit. We declare an additional per-parameter flag that says
"don't eval this, just pass through the string". In absence of this
flag, the struct-to-Signature-izer runs eval on the string and complains
noisily if it fails.
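
A sketch of the amended rule. `passthrough` models the proposed per-parameter flag, and ValueError is only a placeholder for whatever "complains noisily" ends up meaning.

```python
def default_for_signature(text, passthrough=False):
    """Convert a default stored as a string into the value for the Signature."""
    if passthrough:
        # Explicit opt-out: keep the string exactly as written.
        return text
    try:
        return eval(text, {"__builtins__": {}})
    except Exception as exc:
        # No silent fallback: a default that fails to evaluate is an error.
        raise ValueError("cannot evaluate default %r: %s" % (text, exc)) from exc
```

With this, `default_for_signature("0o666")` returns 438, while a string that cannot be evaluated, such as "sys.maxsize" in an empty namespace, only survives if the flag is set.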

> * Right now Clinic paves over the PyArg_ParseTuple API for you.
> If we convert CPython to use Clinic everywhere, theoretically we
> could replace the parsing API with something cleaner and/or faster.
> Does anyone have good ideas (and time, and energy) here?
>
>
> By "paves over" do you mean that Clinic is currently using the
> ParseTuple API in its generated code?

Yes. Specifically, it uses ParseTuple for "positional-only" argument
processing, and ParseTupleAndKeywords for all others. You can see the
latter in the sample output in my original email.


> Yes, we should do better. But don't hold Clinic up on that.

As I have not!


> But the biggest unresolved question... is this all actually a terrible
> idea?
>
>
> No it is not. I like it.

\o/


//arry/
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 12/03/2012 02:37 PM, Barry Warsaw wrote:
> The biggest question with generated code is always the effect on debugging.
> How horrible will it be when I have to step through argument parsing to figure
> out what's going wrong?

Right now, it's exactly like the existing solution. The generated
function looks more or less like the top paragraph of the old code did;
it declares variables, with defaults where appropriate, it calls
PyArg_ParseMumbleMumble, if that fails it returns NULL, and otherwise it
calls the impl function. There *was* an example of generated code in my
original email; I encourage you to go back and take a look. For more
you can look at the bitbucket repo; the output of the DSL is checked in
there, as would be policy if we went with Clinic.

TBH I think debuggability is one of the strengths of this approach.
Unlike C macros, here all the code is laid out in front of you,
formatted for easy reading. And it's not terribly complicated code.

If we change the argument parsing code to use some new API, one hopes we
will have the wisdom to make it /easier/ to read than PyArg_*.


//arry/
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, Dec 4, 2012 at 8:37 AM, Barry Warsaw <barry@python.org> wrote:

> On Dec 03, 2012, at 02:29 PM, Larry Hastings wrote:
>
> >4) Builtin function arguments are defined in a small DSL; these
> > are expanded to code and data using a custom compile-time
> > preprocessor step.
> >
> >All the core devs I've asked said "given all that, I'd prefer the
> >hairy preprocessor macros". But by the end of the conversation
> >they'd changed their minds to prefer the custom DSL. Maybe I'll
> >make a believer out of you too--read on!
>
> The biggest question with generated code is always the effect on debugging.
> How horrible will it be when I have to step through argument parsing to
> figure
> out what's going wrong?
>

That's the advantage of the Cog-style approach that modifies the C source
files in place and records checksums so the generator can easily tell when
the code needs to be regenerated, either because it was changed via hand
editing or because the definition changed. Yes, it violates the guideline
of "don't check in generated code", but it makes debugging sane.
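
The two regeneration triggers can be told apart with a check along these lines. This is a sketch: `render` is a hypothetical generator callback, and SHA-1 over the output text is an assumption about how the checksum is computed.

```python
import hashlib


def block_status(dsl_input, output, recorded_checksum, render):
    """Classify one generated block as 'hand-edited', 'stale', or 'current'."""
    actual = hashlib.sha1(output.encode("utf-8")).hexdigest()
    if actual != recorded_checksum:
        # The checked-in output no longer matches what the generator wrote.
        return "hand-edited"
    if render(dsl_input) != output:
        # The output is intact, but the definition has changed since.
        return "stale"
    return "current"
```

A build step could refuse to overwrite "hand-edited" blocks and silently refresh "stale" ones.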

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 04.12.2012 at 00:42, Gregory P. Smith <greg@krypto.org> wrote:

> * How would we convert all the builtins to use Clinic? I fear any
> solution will involve some work by hand. Even if we can automate
> big chunks of it, fully automating it would require parsing arbitrary
> C. This seems like overkill for a one-shot conversion.
> (Mark Shannon says he has some ideas.)
>
> A lot of hand work. Sprints at pycon. etc. Automating nice chunks of it could be partially done for some easy cases such as things that only use ParseTuple today.

I don’t see this as a big problem. There’s always lots of people who want to get into Python hacking and don’t know where to start. These are easily digestible pieces that can be *reviewed in a timely manner*, thus ideal. We could even do some (virtual) sprint just on that.

As for Larry: great approach, I’m impressed!
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Mon, 03 Dec 2012 14:29:35 -0800,
Larry Hastings <larry@hastings.org> wrote:
>
> /*[clinic]
> dbm.open -> mapping
> basename=dbmopen
>
> const char *filename;
> The filename to open.

So how does it handle the fact that filename can either be a unicode
string or a fsencoding-encoded bytestring? And how does it do the right
encoding/decoding dance, possibly platform-specific?

> static char *_keywords[] = {"filename", "flags", "mode", NULL};
>
> if (!PyArg_ParseTupleAndKeywords(args, kwargs,
> "s|si", _keywords,
> &filename, &flags, &mode))
> return NULL;

I see, it doesn't :-)

> But the biggest unresolved question... is this all actually a terrible
> idea?

I like the idea, but it needs more polishing. I don't think the various
"duck types" accepted by Python can be expressed fully in plain C types
(e.g. you must distinguish between taking all kinds of numbers or only
an __index__-providing number).
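
At the Python level the distinctions look like this. The converter names are invented for illustration; in C the corresponding pieces would be things like PyUnicode_FSConverter and PyNumber_Index rather than a plain "s" or "i" format unit.

```python
import operator
import os


def convert_filename(value):
    """Accept str, bytes, or os.PathLike; return filesystem-encoded bytes."""
    return os.fsencode(value)


def convert_index(value):
    """Accept only objects providing __index__ (ints, but not floats)."""
    return operator.index(value)


def convert_number(value):
    """Accept any real number; reject text even though float() would parse it."""
    if isinstance(value, (str, bytes, bytearray)):
        raise TypeError("a number is required")
    return float(value)
```

A single C type such as `int` or `const char *` cannot say which of these behaviours a parameter wants, which is the gap pointed out above.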

Regards

Antoine.


Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 03.12.2012 23:29, Larry Hastings wrote:
[...autogen some code from special comment strings...]
> /*[clinic]
> dbm.open -> mapping
> basename=dbmopen
>
> const char *filename;
> The filename to open.
>
> const char *flags="r";
> How to open the file. "r" for reading, "w" for writing, etc.
>
> int mode=0666;
> default=0o666
> If creating a new file, the mode bits for the new file
> (e.g. os.O_RDWR).
>
> Returns a database object.
>
> [clinic]*/

Firstly, I like the idea. Even though this "autogenerate in-place" seems
a bit strange at first, I don't think it really hurts in practice. Also,
thanks for introducing me to the 'cog' tool, I think I'll use this now
and then!

This also brings me to a single question I have for your proposal: Why
did you create another DSL instead of using Python, i.e. instead of
using cog directly? Looking at the above, I could imagine this being
written like this instead:

/*[.[.[.cog
import pycognize
with pycognize.function('dbmopen') as f:
    f.add_param('self')
    f.add_kwparam('filename',
                  doc='The filename to open',
                  c_type='char*')
    f.add_kwparam('flags',
                  doc='How to open the file.',
                  c_type='char*',
                  default='r')
    f.set_result('mapping')
]]]*/
//[[[end]]]

Cheers!

Uli


Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 12/04/2012 04:10 AM, Ulrich Eckhardt wrote:
> This also brings me to a single question I have for your proposal: Why
> did you create another DSL instead of using Python, i.e. instead of
> using cog directly? Looking at the above, I could imagine this being
> written like this instead:

Actually my original prototype was written using Cog. When I showed it
to Guido at EuroPython, he suggested a DSL instead, as writing raw
Python code for every single function would be far too wordy. I agree.


//arry/
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
Larry Hastings, 03.12.2012 23:29:
> Say there, the Python core development community! Have I got
> a question for you!
>
> *ahem*
>
> Which of the following four options do you dislike least? ;-)
>
> 1) CPython continues to provide no "function signature"
> objects (PEP 362) or inspect.getfullargspec() information
> for any function implemented in C.

I would love to see Cython generated functions look and behave completely
like normal Python functions at some point, so this is the option I dislike
most.


> 2) We add new hand-coded data structures representing the
> metadata necessary for function signatures for builtins.
> Which means that, when defining arguments to functions in C,
> we'd need to repeat ourselves *even more* than we already do.
>
> 3) Builtin function arguments are defined using some seriously
> uncomfortable and impenetrable C preprocessor macros, which
> produce all the various types of output we need (argument
> processing code, function signature metadata, possibly
> the docstrings too).
>
> 4) Builtin function arguments are defined in a small DSL; these
> are expanded to code and data using a custom compile-time
> preprocessor step.
> [...]
> * There's actually a fifth option, proposed by Brett Cannon. We
> constrain the format of docstrings for builtin functions to make
> them machine-readable, then generate the function signature objects
> from that. But consider: generating *everything* in the signature
> object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and
> this might gunk up the docstring.

Why not provide a constructor for signature objects that parses the
signature from a string? For a signature like

    def func(int arg1, float arg2, ExtType arg3, *,
             object arg4=None) -> ExtType2:
        ...

you'd just pass in this string:

(arg1 : int, arg2 : float, arg3 : ExtType, *, arg4=None) -> ExtType2

or maybe prefixed by the function name, don't care. Might make it easier to
pass it into the normal parser.

For more than one alternative input type, use a tuple of types. For builtin
types that are shadowed by C type names, pass "builtins.int" etc.
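
A minimal sketch of what such a string-parsing constructor might look like, written in pure Python on top of the existing parser. The function name signature_from_string is hypothetical, annotations are kept as plain strings rather than resolved to types, defaults are assumed to be Python literals, and ast.unparse requires Python 3.9 or later:

```python
import ast
import inspect


def signature_from_string(text):
    """Build an inspect.Signature from text like '(a: int, *, b=None) -> T'.

    Hypothetical helper: annotations stay as strings, defaults must be literals.
    """
    # Wrap the text in a dummy function so the normal parser can read it.
    func = ast.parse("def _f" + text + ": pass").body[0]
    args = func.args
    empty = inspect.Parameter.empty

    def make(arg, kind, default=empty):
        annotation = (ast.unparse(arg.annotation)
                      if arg.annotation is not None else empty)
        return inspect.Parameter(arg.arg, kind, default=default,
                                 annotation=annotation)

    params = []
    positional = args.posonlyargs + args.args
    # Defaults line up with the *last* positional parameters.
    first_default = len(positional) - len(args.defaults)
    for i, arg in enumerate(positional):
        kind = (inspect.Parameter.POSITIONAL_ONLY
                if i < len(args.posonlyargs)
                else inspect.Parameter.POSITIONAL_OR_KEYWORD)
        if i >= first_default:
            default = ast.literal_eval(args.defaults[i - first_default])
            params.append(make(arg, kind, default))
        else:
            params.append(make(arg, kind))
    if args.vararg is not None:
        params.append(make(args.vararg, inspect.Parameter.VAR_POSITIONAL))
    # kw_defaults holds None for keyword-only parameters without a default.
    for arg, default in zip(args.kwonlyargs, args.kw_defaults):
        if default is None:
            params.append(make(arg, inspect.Parameter.KEYWORD_ONLY))
        else:
            params.append(make(arg, inspect.Parameter.KEYWORD_ONLY,
                               ast.literal_eval(default)))
    if args.kwarg is not None:
        params.append(make(args.kwarg, inspect.Parameter.VAR_KEYWORD))

    returns = (ast.unparse(func.returns) if func.returns is not None
               else inspect.Signature.empty)
    return inspect.Signature(params, return_annotation=returns)


sig = signature_from_string(
    "(arg1 : int, arg2 : float, arg3 : ExtType, *, arg4=None) -> ExtType2")
print(list(sig.parameters))
```

Keeping the annotations as strings sidesteps the question of how names such as ExtType would be resolved, which the proposal leaves open.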

Stefan


Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Mon, 2012-12-03 at 14:29 -0800, Larry Hastings wrote:

[...snip compelling sales pitch...]

I like the idea.

As noted elsewhere, sane generated C code is much easier to step through
in the debugger than preprocessor macros (though "sane" in that sentence
is begging the question, I guess, but the examples you post look good to
me). It also seems cleaner to split the argument handling from the
implementation of the function (iirc Cython already has an analogous
split and can use this to bypass arg tuple creation).

The proposal potentially also eliminates a source of bugs: mismatches
between the format strings in PyArg_Parse* vs the underlying C types
passed as varargs (which are a major pain for big-endian CPUs where int
vs long screwups can really bite you).
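
To illustrate why such a mismatch bites on big-endian machines in particular, here is a small simulation in Python. It assumes a 4-byte int and an 8-byte long, and that the long variable happens to be zero-initialized; the function name is hypothetical and the struct module stands in for the raw memory write that PyArg_Parse* would perform:

```python
import struct


def store_int_into_long_slot(value, byteorder):
    """Simulate writing a 4-byte int through a pointer to an 8-byte long."""
    slot = bytearray(8)  # the C 'long' variable, assumed to start out as zero
    prefix = "<" if byteorder == "little" else ">"
    # The format unit says "int", so only the first 4 bytes are written.
    slot[0:4] = struct.pack(prefix + "i", value)
    # The caller then reads all 8 bytes back as a long.
    return struct.unpack(prefix + "q", bytes(slot))[0]


print(store_int_into_long_slot(7, "little"))  # low-order bytes: value survives
print(store_int_into_long_slot(7, "big"))     # high-order bytes: value is shifted
```

On the little-endian layout the bug goes unnoticed as long as the upper bytes are zero; on the big-endian layout the same code yields 7 << 32.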

I got worried that this could introduce a bootstrapping issue (given
that the clinic is implemented using Python itself), but given that the
generated code is checked in as part of the C source file, you always
have the source you need to regenerate the interpreter.

Presumably 3rd party extension modules could use this also, in which
case the clinic tool could be something that could be installed/packaged
as part of Python 3.4 ?

[...snip...]

> Big unresolved questions:
>
> * How would we convert all the builtins to use Clinic? I fear any
> solution will involve some work by hand. Even if we can automate
> big chunks of it, fully automating it would require parsing arbitrary
> C. This seems like overkill for a one-shot conversion.
> (Mark Shannon says he has some ideas.)

Potentially my gcc python plugin could be used to autogenerate things.
FWIW I already have Python code running inside gcc that can parse the
PyArg_* APIs:
http://git.fedorahosted.org/cgit/gcc-python-plugin.git/tree/libcpychecker/PyArg_ParseTuple.py

Though my plugin runs after the C preprocessor has been run, so it may
be fiddly to use this to autogenerate patches.


Hope this is helpful
Dave

Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On 12/04/2012 01:08 AM, Antoine Pitrou wrote:
> On Mon, 03 Dec 2012 14:29:35 -0800,
> Larry Hastings <larry@hastings.org> wrote:
>> /*[clinic]
>> dbm.open -> mapping
>> basename=dbmopen
>>
>> const char *filename;
>> The filename to open.
> So how does it handle the fact that filename can either be a unicode
> string or a fsencoding-encoded bytestring? And how does it do the right
> encoding/decoding dance, possibly platform-specific?
>
> [...]
> I see, it doesn't :-)

If you compare the Clinic-generated code to the current implementation
of dbm.open (and all the other functions I've touched) you'll find the
"format units" specified to PyArg_Parse* are identical. Thus I assert
the replacement argument parsing is no worse (and no better) than what's
currently shipping in Python.

Separately, I contributed code that handles unicode vs bytes for
filenames in a reasonably cross-platform way; see "path_converter" in
Modules/posixmodule.c. (This shipped in Python 3.3.) And indeed, I
have examples of using "path_converter" with Clinic in my branch.
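
For readers unfamiliar with the str-versus-bytes problem, the Python-level counterpart is os.fsencode, which accepts either form and applies the filesystem encoding. The sketch below is only an analogy for the kind of normalization involved, not a description of how path_converter is implemented in C; normalize_filename is a hypothetical name:

```python
import os


def normalize_filename(filename):
    """Accept a str or bytes filename and return it as bytes."""
    if not isinstance(filename, (str, bytes)):
        raise TypeError("filename must be str or bytes, not %s"
                        % type(filename).__name__)
    # os.fsencode encodes str with the filesystem encoding and
    # returns bytes unchanged.
    return os.fsencode(filename)


print(normalize_filename("spam.db"))
print(normalize_filename(b"spam.db"))
```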

Along these lines, I've been contemplating proposing that Clinic
specifically understand "path" arguments, distinctly from other string
arguments, as they are both common and rarely handled correctly. My
main fear is that I probably don't understand all their complexities
either ;-)

Anyway, this is certainly something we can consider *improving* for
Python 3.4. But for now I'm trying to make Clinic an indistinguishable
drop-in replacement.


> I like the idea, but it needs more polishing. I don't think the various
> "duck types" accepted by Python can be expressed fully in plain C types
> (e.g. you must distinguish between taking all kinds of numbers or only
> an __index__-providing number).

Naturally I agree Clinic needs more polishing. But the problem you fear
is already solved. Clinic allows precisely expressing any existing
PyArg_ "format unit"** through a combination of the type of the
parameter and its "flags". The flags only become necessary for types
used by multiple format units; for example, s, z, es, et, es#, et#, y,
and y# all map to char *, so it's necessary to disambiguate by using the
"flags". The specific case you cite ("__index__-providing number") is
already unambiguous; that's n, mapped to Py_ssize_t. There aren't any
other format units that map to a Py_ssize_t, so we're done.

** Well, any format unit except w*. I don't handle it just because I
wasn't sure how best to do so.
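
The disambiguation rule described above can be sketched in a few lines of Python. The table is only an excerpt containing the format units named in this message, mapped to C types as stated here; it is not the full PyArg_Parse* table, and the helper names are hypothetical:

```python
from collections import defaultdict

# Excerpt only: the format units mentioned in this message.
FORMAT_UNIT_TO_C_TYPE = {
    "s": "char *", "z": "char *", "es": "char *", "et": "char *",
    "es#": "char *", "et#": "char *", "y": "char *", "y#": "char *",
    "n": "Py_ssize_t",
}


def units_by_c_type(table):
    """Group format units by the C type they map to."""
    grouped = defaultdict(list)
    for unit, c_type in table.items():
        grouped[c_type].append(unit)
    return dict(grouped)


def needs_flags(c_type, table=FORMAT_UNIT_TO_C_TYPE):
    """True when the C type alone cannot identify a single format unit."""
    return len(units_by_c_type(table).get(c_type, [])) > 1


print(needs_flags("char *"))      # several units share this type
print(needs_flags("Py_ssize_t"))  # only "n" maps here
```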


//arry/
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Mon, Dec 3, 2012 at 5:29 PM, Larry Hastings <larry@hastings.org> wrote:

>
> Say there, the Python core development community! Have I got
> a question for you!
>
> *ahem*
>
> Which of the following four options do you dislike least? ;-)
>
> 1) CPython continues to provide no "function signature"
> objects (PEP 362) or inspect.getfullargspec() information
> for any function implemented in C.
>
> 2) We add new hand-coded data structures representing the
> metadata necessary for function signatures for builtins.
> Which means that, when defining arguments to functions in C,
> we'd need to repeat ourselves *even more* than we already do.
>
> 3) Builtin function arguments are defined using some seriously
> uncomfortable and impenetrable C preprocessor macros, which
> produce all the various types of output we need (argument
> processing code, function signature metadata, possibly
> the docstrings too).
>
> 4) Builtin function arguments are defined in a small DSL; these
> are expanded to code and data using a custom compile-time
> preprocessor step.
>
>
> All the core devs I've asked said "given all that, I'd prefer the
> hairy preprocessor macros". But by the end of the conversation
> they'd changed their minds to prefer the custom DSL. Maybe I'll
> make a believer out of you too--read on!
>
>
[snip]


> * There's actually a fifth option, proposed by Brett Cannon. We
> constrain the format of docstrings for builtin functions to make
> them machine-readable, then generate the function signature objects
> from that. But consider: generating *everything* in the signature
> object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and
> this might gunk up the docstring.
>

I should mention that I was one of the people Larry pitched this to, and
I suggested this fifth option before I fully understood the extent to which
the DSL supported the various crazy options needed to cover all current
use-cases in CPython.

Regardless, I fully support what Larry is proposing.
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Dec 04, 2012, at 11:47 AM, David Malcolm wrote:

>As noted elsewhere, sane generated C code is much easier to step through
>in the debugger than preprocessor macros (though "sane" in that sentence
>is begging the question, I guess, but the examples you post look good to
>me).

And to me too.

-Barry
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, 04 Dec 2012 11:04:09 -0800
Larry Hastings <larry@hastings.org> wrote:
>
> Along these lines, I've been contemplating proposing that Clinic
> specifically understand "path" arguments, distinctly from other string
> arguments, as they are both common and rarely handled correctly. My
> main fear is that I probably don't understand all their complexities
> either ;-)
>
> Anyway, this is certainly something we can consider *improving* for
> Python 3.4. But for now I'm trying to make Clinic an indistinguishable
> drop-in replacement.
>
[...]
>
> Naturally I agree Clinic needs more polishing. But the problem you fear
> is already solved. Clinic allows precisely expressing any existing
> PyArg_ "format unit"** through a combination of the type of the
> parameter and its "flags".

Very nice then! Your work is promising, and I hope we'll see a version
of it some day in Python 3.4 (or 3.4+k).

Regards

Antoine.


Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, Dec 4, 2012 at 11:35 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> On Tue, 04 Dec 2012 11:04:09 -0800
> Larry Hastings <larry@hastings.org> wrote:
>>
>> Along these lines, I've been contemplating proposing that Clinic
>> specifically understand "path" arguments, distinctly from other string
>> arguments, as they are both common and rarely handled correctly. My
>> main fear is that I probably don't understand all their complexities
>> either ;-)
>>
>> Anyway, this is certainly something we can consider *improving* for
>> Python 3.4. But for now I'm trying to make Clinic an indistinguishable
>> drop-in replacement.
>>
> [...]
>>
>> Naturally I agree Clinic needs more polishing. But the problem you fear
>> is already solved. Clinic allows precisely expressing any existing
>> PyArg_ "format unit"** through a combination of the type of the
>> parameter and its "flags".
>
> Very nice then! Your work is promising, and I hope we'll see a version
> of it some day in Python 3.4 (or 3.4+k).

+1 for getting this into 3.4. Does it need a PEP, or just a bug
tracker item + code review? I think the latter is fine -- it's
probably better not to do too much bikeshedding but just to let Larry
propose a patch, have it reviewed and submitted, and then iterate.
It's also okay if it is initially used for only a subset of extension
modules (and even if some functions/methods can't be expressed using
it yet).

--
--Guido van Rossum (python.org/~guido)
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
Hi,

On Mon, Dec 3, 2012 at 3:42 PM, Gregory P. Smith <greg@krypto.org> wrote:
> In fact allowing a version of Clinic to work stand alone as a
> PyPI project and generate Python 2.7 and 3.2/3.3 extension module
> boilerplate would increase its adoption and improve the quality of
> some existing extension modules that choose to use it.

I agree: the same idea applies equally well to all existing 3rd-party
extension modules, and does not depend on new CPython C API functions
(so far), so Clinic should be released as a PyPI project too.


See you soon,

Armin.
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, Dec 4, 2012 at 4:17 PM, Guido van Rossum <guido@python.org> wrote:

> On Tue, Dec 4, 2012 at 11:35 AM, Antoine Pitrou <solipsis@pitrou.net>
> wrote:
> > On Tue, 04 Dec 2012 11:04:09 -0800
> > Larry Hastings <larry@hastings.org> wrote:
> >>
> >> Along these lines, I've been contemplating proposing that Clinic
> >> specifically understand "path" arguments, distinctly from other string
> >> arguments, as they are both common and rarely handled correctly. My
> >> main fear is that I probably don't understand all their complexities
> >> either ;-)
> >>
> >> Anyway, this is certainly something we can consider *improving* for
> >> Python 3.4. But for now I'm trying to make Clinic an indistinguishable
> >> drop-in replacement.
> >>
> > [...]
> >>
> >> Naturally I agree Clinic needs more polishing. But the problem you fear
> >> is already solved. Clinic allows precisely expressing any existing
> >> PyArg_ "format unit"** through a combination of the type of the
> >> parameter and its "flags".
> >
> > Very nice then! Your work is promising, and I hope we'll see a version
> > of it some day in Python 3.4 (or 3.4+k).
>
> +1 for getting this into 3.4. Does it need a PEP, or just a bug
> tracker item + code review? I think the latter is fine -- it's
> probably better not to do too much bikeshedding but just to let Larry
> propose a patch, have it reviewed and submitted, and then iterate.
> It's also okay if it is initially used for only a subset of extension
> modules (and even if some functions/methods can't be expressed using
> it yet).
>

I don't see a need for a PEP either; code review should be plenty since
this doesn't change how the outside world views public APIs. And we can
convert code iteratively so that shouldn't hold things up either.
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, 4 Dec 2012 16:45:54 -0500
Brett Cannon <brett@python.org> wrote:
> >
> > +1 for getting this into 3.4. Does it need a PEP, or just a bug
> > tracker item + code review? I think the latter is fine -- it's
> > probably better not to do too much bikeshedding but just to let Larry
> > propose a patch, have it reviewed and submitted, and then iterate.
> > It's also okay if it is initially used for only a subset of extension
> > modules (and even if some functions/methods can't be expressed using
> > it yet).
> >
>
> I don't see a need for a PEP either; code review should be plenty since
> this doesn't change how the outside world views public APIs. And we can
> convert code iteratively so that shouldn't hold things up either.

I think the DSL itself does warrant public exposure. It will be an
element of the CPython coding style, if its use becomes widespread.

Regards

Antoine.
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, Dec 4, 2012 at 9:29 AM, Larry Hastings <larry@hastings.org> wrote:
> To save you a little time, here's a preview of using Clinic for
> dbm.open(). The stuff at the same indent as a declaration are
> options; see the "clinic.txt" in the repo above for full documentation.
>
> /*[clinic]
> ... hand-written content ...
> [clinic]*/
>
> ... generated content ...
> /*[clinic end:eddc886e542945d959b44b483258bf038acf8872]*/
>

One thing I'm not entirely clear on. Do you run Clinic on a source
file and it edits that file, or is it a step in the build process?
Your description of a preprocessor makes me think the latter, but the
style of code (eg the checksum) suggests the former.
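
Going by the original description (the output is written immediately after the input comment and overwritten on every run), a Cog-style in-place regeneration could be sketched as below. This is not Clinic's actual code: generate is a hypothetical stand-in for the DSL translation, and what the checksum covers is not stated in this thread, so the sketch assumes a SHA-1 digest of the generated output.

```python
import hashlib
import re

BLOCK = re.compile(
    r"(/\*\[clinic\]\n(?P<input>.*?)\[clinic\]\*/\n)"  # hand-written input
    r".*?"                                              # old output, discarded
    r"/\*\[clinic end:[0-9a-f]*\]\*/",                  # end marker
    re.DOTALL,
)


def generate(dsl_text):
    """Hypothetical stand-in for the real DSL-to-C translation."""
    count = len(dsl_text.splitlines())
    return "/* generated from %d line(s) of input */\n" % count


def regenerate(source):
    """Return the source text with every clinic block's output rewritten."""
    def replace(match):
        output = generate(match.group("input"))
        # Assumption: the checksum is taken over the generated output, so a
        # later run could detect hand edits to it.
        digest = hashlib.sha1(output.encode("utf-8")).hexdigest()
        return match.group(1) + output + "/*[clinic end:%s]*/" % digest
    return BLOCK.sub(replace, source)


example = (
    "/*[clinic]\n"
    "dbm.open -> mapping\n"
    "[clinic]*/\n"
    "stale output\n"
    "/*[clinic end:]*/\n"
)
print(regenerate(example))
```

Because the rewritten file is what gets checked in, running the tool a second time on its own output leaves the file unchanged.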

ChrisA
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, Dec 4, 2012 at 4:48 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:

> On Tue, 4 Dec 2012 16:45:54 -0500
> Brett Cannon <brett@python.org> wrote:
> > >
> > > +1 for getting this into 3.4. Does it need a PEP, or just a bug
> > > tracker item + code review? I think the latter is fine -- it's
> > > probably better not to do too much bikeshedding but just to let Larry
> > > propose a patch, have it reviewed and submitted, and then iterate.
> > > It's also okay if it is initially used for only a subset of extension
> > > modules (and even if some functions/methods can't be expressed using
> > > it yet).
> > >
> >
> > I don't see a need for a PEP either; code review should be plenty since
> > this doesn't change how the outside world views public APIs. And we can
> > convert code iteratively so that shouldn't hold things up either.
>
> I think the DSL itself does warrant public exposure. It will be an
> element of the CPython coding style, if its use becomes widespread.
>

That's what the issue will tease out, so this isn't going in without some
public scrutiny. But going through python-ideas for this I think is a bit
much. I mean we don't clear every change to PEP 7 or 8 with the public and
that directly affects people as well in terms of coding style.
Re: Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython
On Tue, 4 Dec 2012 16:54:27 -0500
Brett Cannon <brett@python.org> wrote:
> On Tue, Dec 4, 2012 at 4:48 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
>
> > On Tue, 4 Dec 2012 16:45:54 -0500
> > Brett Cannon <brett@python.org> wrote:
> > > >
> > > > +1 for getting this into 3.4. Does it need a PEP, or just a bug
> > > > tracker item + code review? I think the latter is fine -- it's
> > > > probably better not to do too much bikeshedding but just to let Larry
> > > > propose a patch, have it reviewed and submitted, and then iterate.
> > > > It's also okay if it is initially used for only a subset of extension
> > > > modules (and even if some functions/methods can't be expressed using
> > > > it yet).
> > > >
> > >
> > > I don't see a need for a PEP either; code review should be plenty since
> > > this doesn't change how the outside world views public APIs. And we can
> > > convert code iteratively so that shouldn't hold things up either.
> >
> > I think the DSL itself does warrant public exposure. It will be an
> > element of the CPython coding style, if its use becomes widespread.
> >
>
> That's what the issue will tease out, so this isn't going in without some
> public scrutiny. But going through python-ideas for this I think is a bit
> much. I mean we don't clear every change to PEP 7 or 8 with the public and
> that directly affects people as well in terms of coding style.

Not necessarily python-ideas, but python-dev.
(I hope we don't need a separate clinic-dev mailing-list, although it
certainly sounds funny)

Regards

Antoine.