Stefan Behnel | 29 Apr 07:11 2012
Re: [Cython] Wacky idea: proper macros

Wes McKinney, 29.04.2012 03:14:
> On Sat, Apr 28, 2012 at 5:25 PM, mark florisson wrote:
>> On 28 April 2012 22:04, Nathaniel Smith wrote:
>>> Was chatting with Wes today about the usual problem many of us have
>>> encountered with needing to use some sort of templating system to
>>> generate code handling multiple types, operations, etc., and a wacky
>>> idea occurred to me. So I thought I'd throw it out here.
>>>
>>> What if we added a simple macro facility to Cython, that worked at the
>>> AST level? (I.e. I'm talking lisp-style macros, *not* C-style macros.)
>>> Basically some way to write arbitrary Python code into a .pyx file
>>> that gets executed at compile time and can transform the AST, plus
>>> some nice convenience APIs for simple transformations.
>>>
>>> E.g., if we steal the illegal token sequence @@ as our marker, we
>>> could have something like:
>>>
>>> @@ # alone on a line, starts a block of Python code
>>> from Cython.MacroUtil import replace_ctype
>>> def expand_types(placeholder, typelist):
>>>  def my_decorator(function_name, ast):
>>>    functions = {}
>>>    for typename in typelist:
>>>      new_name = "%s_%s" % (function_name, typename)
>>>      functions[new_name] = replace_ctype(ast, placeholder, typename)
>>>    return functions
>>>  return my_decorator
>>> @@ # this token sequence cannot occur in Python, so it's a safe end-marker
>>>
>>> # Compile-time function decorator
>>> # Results in two cdef functions named sum_double and sum_int
>>> @@expand_types("T", ["double", "int"])
>>> cdef T sum(np.ndarray[T] arr):
>>>  cdef T start = 0;
>>>  for i in range(arr.size):
>>>    start += arr[i]
>>>  return start
>>>
>>> I don't know if this is a good idea, but it seems like it'd be very
>>> easy to do on the Cython side, fairly clean, and be dramatically less
>>> horrible than all the ad-hoc templating stuff people do now.
>>> Presumably there'd be strict limits on how much backwards
>>> compatibility we'd be willing to guarantee for code that went poking
>>> around in the AST by hand, but a small handful of functions like my
>>> notional "replace_ctype" would go a long way, and wouldn't impose much
>>> of a compatibility burden.
>>
>> Have you looked at http://wiki.cython.org/enhancements/metaprogramming ?
>>
>> In general I would like better meta-programming support, maybe even
>> allow defining new operators (although I'm not sure any of it is very
>> pythonic), but for templates I think fused types should be used, or
>> improved when they fall short. Maybe a plugin system could also help
>> people.
> 
> I referenced this problem recently in a blog post
> (http://wesmckinney.com/blog/?p=467). My main interest these days is
> in expressing data algorithms. I've unfortunately found myself working
> around performance problems with fundamental array operations in NumPy
> so a lot of the Cython work I've done has been in and around this. In
> lieu of some kind of macro system it seems inevitable that I'm going
> to need to create some kind of mini array language or otherwise code
> generation framework (targeting C, Cython, or Fortran). I worry that
> this is going to end with me creating "yet another APL [or Haskell]
> implementation"  but I really need something that runs inside CPython.

Generally speaking, it's always better to collect and describe use cases
before adding a language feature, especially one that is as complex and
far-reaching as this.

It might well be that fused types can (be made to) work for these use
cases, and it might be that a (non-AST-based) preprocessor step would work.
Keeping metaprogramming facilities out of the compiler makes them more
generally versatile (and easier to replace or disable) and keeps both sides
simpler.
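
To make the preprocessor idea a bit more concrete, here is a minimal sketch
of a text-level (non-AST) expansion step that runs before Cython ever sees
the code. It is only an illustration under my own assumptions, not an
existing Cython tool: the "$T" placeholder syntax, the TEMPLATE string and
the expand_types() helper are made up for this example, and the function
body is simply the one from the example quoted above. The generated text is
not checked for validity here; that would be left to the Cython compiler.

```python
from string import Template

# Hypothetical template: "$T" marks the C type to substitute. The body is
# taken from the sum() example quoted above.
TEMPLATE = """\
cdef $T sum_$T(np.ndarray[$T] arr):
    cdef $T start = 0
    for i in range(arr.size):
        start += arr[i]
    return start
"""


def expand_types(template, typelist):
    """Return source text containing one copy of the template per type."""
    tmpl = Template(template)
    return "\n".join(tmpl.substitute(T=typename) for typename in typelist)


if __name__ == "__main__":
    # Emits two functions, sum_double and sum_int, as plain text that could
    # be written into a generated .pyx or .pxi file.
    print(expand_types(TEMPLATE, ["double", "int"]))
```

Since this lives entirely outside of the compiler, it can be replaced by any
other templating tool, or dropped again once fused types cover the case.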

> And why not? Most of these algorithms could be expressed at a very
> high level and lead to pretty clean generated C with many of the
> special cases (contiguous memory, or low dimensions in the case of
> n-dimensional algorithms) checked and handled in simplified loops.

It sounds like what you want is a preprocessor that spells out NumPy
array configuration options into separate code paths. However, I'm not sure
a generic approach would work well (enough?) here. And it also doesn't
sound like this needs to be done inside of the compiler. It should be
possible to build that on top of fused types as a separate preprocessor.

Stefan
