Stefan Behnel | 10 May 15:51 2011

Re: Re: automatic character conversion problem

Hans Terlouw, 10.05.2011 14:41:
> Thank you for your explanation. I understand what happens. This
> behaviour is however different from previous versions of Cython and
> also from what I found in Cython's documentation. E.g. in your Cython
> Tutorial (S. Behnel, R. Bradshaw, D. Seljebotn: PREPRINT, submitted to
> Proc. SciPy 2009, pp. 1–12) I read:
>    "It is, however, very easy to pass byte strings between C
>     code and Python. When receiving a byte string from a
>     C library, you can let Cython convert it into a Python
>     byte string by simply assigning it to a Python variable:
>       cdef char* c_string = c_call_returning_a_c_string()
>       py_string = c_string

That's no longer correct. It's also not very explicit code. You can see 
that when you change the names:

   cdef char* c_string = c_call_returning_a_c_string()
   c_string_ptr = c_string

Now, you want "c_string_ptr" to become a Python variable here? IMHO, that's 
less obvious than the simple assignment that now happens.

> Now I see a value corresponding to "c_str_a" _after_ the last call to
> strcpy(). I find this confusing. In my case the effect of this change
> is that code that worked previously now fails in strange ways. Mainly
> due to now prematurely free'd char pointers.

They are not being freed in your example code, but I know what you mean. 
It's quite likely that existing code frees the memory buffer right after 
the (previously converting) assignment.
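
To make that failure mode concrete, here is a minimal sketch. It assumes a 
Cython version with the behaviour discussed in this thread, where an untyped 
variable assigned from a char* is itself treated as a char*. The function 
name and the malloc'ed buffer are made up for illustration and stand in for 
whatever your C library hands back.

```cython
from libc.stdlib cimport malloc, free
from libc.string cimport strcpy

def dangling_alias():
    # Hypothetical stand-in for a C call that returns an owned buffer.
    cdef char* c_string = <char*>malloc(6)
    if c_string is NULL:
        raise MemoryError()
    strcpy(c_string, b"hello")

    # Untyped target: treated as a char* here, so this is a plain
    # pointer assignment and no Python bytes object is created.
    result = c_string

    # The buffer is released while 'result' still points at it.
    free(c_string)

    # Converting 'result' to a Python object from here on (for example
    # by returning it) would read freed memory, so it is left out.
```

With the old converting assignment, "result" would have been an independent 
copy and the free() would have been harmless.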

> I am quite capable to change my code so that the problem is avoided,
> and I will, but I think this may also be a pitfall for others. In
> the newer documentation I found examples with a "cdef bytes"
> declaration. If this is used, the problem also doesn't occur.

Yep, that's one way to do it now. Explicit and safe. You can also use 
"cdef object" or a cast to a Python object type.
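
As a rough sketch, the three spellings look like this. The function and 
variable names are invented for the example, and the malloc'ed buffer again 
stands in for a string coming from a C library. In each case the conversion 
copies the C data into a Python bytes object before the buffer is freed.

```cython
from libc.stdlib cimport malloc, free
from libc.string cimport strcpy

def explicit_conversions():
    # Hypothetical stand-in for a C call that returns an owned buffer.
    cdef char* c_string = <char*>malloc(6)
    cdef bytes as_bytes
    cdef object as_object
    if c_string is NULL:
        raise MemoryError()
    try:
        strcpy(c_string, b"hello")
        as_bytes = c_string          # target typed as bytes: converts
        as_object = c_string         # target typed as object: converts
        via_cast = <bytes>c_string   # cast to a Python type: converts
    finally:
        # Safe: the Python objects above own their own copies.
        free(c_string)
    return as_bytes, as_object, via_cast
```

All three results stay valid after the free(), because none of them refers 
to the C buffer any more.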

There is also an explicit section in the 0.13 release notes on this:

> Perhaps
> it's an idea to do this declaration implicitly whenever a C string is
> assigned to a Python variable.

An untyped variable, you mean. That would be a special case only for char* 
values. It's backwards compatible, sure, but IMHO more surprising.