Discussion:
main() -> Py_SetProgramName()
Barry Warsaw
2012-12-05 00:12:01 UTC
One gotcha with porting embedded Python 3 is the mismatch between main()'s
signature and Py_SetProgramName() and PySys_SetArgv().

In Python 2, everything was easy. You got char*'s from main() and could pass
them directly to these two calls. Not in Python 3, because they now take
wchar_t*'s instead. I get why these signatures have changed, but that doesn't
make life very easy for porters.

Take a look at main() in Modules/python.c to see the headaches Python itself
goes through to do the conversions. I think we're doing a disservice to
embedders not to provide convenience functions, alternative APIs, or at the
very least code examples for helping them do the argument conversions. This
is not easy code, it's error prone, and folks shouldn't have to roll their own
every time they need to do this.

Using the algorithm in main() is probably not the best recommendation either,
because it uses non-public API methods such as _Py_char2wchar(). Perhaps
these should be promoted to a public method, or we should add a method to get
from main()'s char** to a wchar_t**.

For now, I've tried to use mbsrtowcs(), though I haven't done extensive
testing on the code. I think Python ultimately uses mbstowcs() down deep in
its bowels.
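
A minimal sketch of the mbsrtowcs() approach mentioned above, for illustration
only. The helper name decode_locale_string() is hypothetical and not part of
the Python C API. It assumes the caller has already selected the locale (for
example with setlocale(LC_ALL, "")), since otherwise non-ASCII bytes may not
decode.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a Python API): convert one locale-encoded
 * string to a newly allocated wide string using mbsrtowcs().
 * Returns NULL on an invalid byte sequence or on allocation failure.
 * The caller owns the result and releases it with free(). */
static wchar_t *
decode_locale_string(const char *arg)
{
    mbstate_t state;
    const char *src = arg;
    size_t len;
    wchar_t *result;

    /* First pass: with a NULL destination, mbsrtowcs() only counts the
     * wide characters needed (excluding the terminator). */
    memset(&state, 0, sizeof(state));
    len = mbsrtowcs(NULL, &src, 0, &state);
    if (len == (size_t)-1)
        return NULL;

    result = malloc((len + 1) * sizeof(wchar_t));
    if (result == NULL)
        return NULL;

    /* Second pass: convert for real, leaving room for the terminator. */
    src = arg;
    memset(&state, 0, sizeof(state));
    if (mbsrtowcs(result, &src, len + 1, &state) == (size_t)-1) {
        free(result);
        return NULL;
    }
    return result;
}
```

The result for argv[0] could then be handed to Py_SetProgramName(), which
takes a wchar_t* in Python 3. That call expects storage that stays valid for
the life of the program, so the program name should not be freed early.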

There was some discussion of this back in 2009 IIRC, but nothing ever came of
it. I think MvL at the time was against adding any convenience or alternative
API to Python.

Has anybody else encountered this while porting embedded Python applications
to Python 3? How did you solve it?

I'm happy to bring this up on python-dev, but I also don't want to have to
wait until Python 3.4 to have a nice solution.

Cheers,
-Barry
M.-A. Lemburg
2012-12-05 08:16:44 UTC
Post by Barry Warsaw
For now, I've tried to use mbsrtowcs(), though I haven't done extensive
testing on the code. I think Python ultimately uses mbstowcs() down deep in
its bowels.
There's also another issue with the approach, since changing the
**argv from within Python is no longer possible on non-Windows
platforms.

This doesn't only affect embedded uses of Python, but all other
uses as well, e.g. it's no longer possible to change the ps output
under Unix for daemons and the like.
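
For illustration, the kind of in-place rewrite this refers to can be sketched
in C as below. This assumes a Linux-like system where ps reports the contents
of the original argv memory. overwrite_argv0() is a hypothetical name, and the
sketch only reuses the bytes that argv[0] already occupies, so longer titles
are truncated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite the original argv[0] buffer in place.
 * Only the bytes argv[0] already occupies are reused, so the title is
 * truncated if it is longer than the original program name.
 * Returns the number of title bytes actually written. */
static size_t
overwrite_argv0(char *argv0, const char *title)
{
    size_t room = strlen(argv0);   /* bytes available, terminator excluded */
    size_t n = strlen(title);

    if (n > room)
        n = room;
    memcpy(argv0, title, n);
    /* Zero the remainder so no stale characters from the old name remain. */
    memset(argv0 + n, '\0', room - n);
    return n;
}
```

This needs the char* that main() received, which is exactly what is no longer
reachable once only the converted wchar_t copies are passed on.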

I think that we should have APIs going from the original char **argv
to the Py_Main() wchar_t **argv one, as well as APIs that allow
changing or at least accessing the original char **argv from within
Python (on non-Windows platforms).

That said, I don't think this is going to happen in a patch level
release...
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Dec 05 2012)
Python Projects, Consulting and Support ... http://www.egenix.com/
mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2012-11-28: Released eGenix mx Base 3.2.5 ... http://egenix.com/go36
2013-01-22: Python Meeting Duesseldorf ... 48 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
Antoine Pitrou
2012-12-08 09:40:32 UTC
Post by M.-A. Lemburg
There's also another issue with the approach, since changing the
**argv from within Python is no longer possible on non-Windows
platforms.
This doesn't only affect embedded uses of Python, but all other
uses as well, e.g. it's no longer possible to change the ps output
under Unix for daemons and the like.
setproctitle is your friend:
http://pypi.python.org/pypi/setproctitle

Regards

Antoine.
M.-A. Lemburg
2012-12-08 11:57:53 UTC
Post by Antoine Pitrou
Post by M.-A. Lemburg
There's also another issue with the approach, since changing the
**argv from within Python is no longer possible on non-Windows
platforms.
This doesn't only affect embedded uses of Python, but all other
uses as well, e.g. it's no longer possible to change the ps output
under Unix for daemons and the like.
http://pypi.python.org/pypi/setproctitle
Thanks for the pointer, but I think this is more than enough
proof that something should be done to make the situation in
Py3 easier for everyone.

Here's the hack setproctitle uses to find the original argv areas
by walking backwards from environ[0]:

https://github.com/dvarrazzo/py-setproctitle/blob/master/src/spt_setup.c#L139
--
Marc-Andre Lemburg
eGenix.com

Martin v. Löwis
2012-12-05 12:53:01 UTC
Post by Barry Warsaw
Using the algorithm in main() is probably not the best recommendation either,
because it uses non-public API methods such as _Py_char2wchar(). Perhaps
these should be promoted to a public method, or we should add a method to get
from main()'s char** to a wchar_t**.
For now, I've tried to use mbsrtowcs(), though I haven't done extensive
testing on the code. I think Python ultimately uses mbstowcs() down deep in
its bowels.
There was some discussion of this back in 2009 IIRC, but nothing ever came of
it. I think MvL at the time was against adding any convenience or alternative
API to Python.
If I said that, I may not have meant it this way. I may have been
opposed to a convenience function that implicitly calls setlocale, which
in turn would be necessary before mbsrtowcs can do anything useful
(for non-ASCII characters).
Post by Barry Warsaw
I'm happy to bring this up on python-dev, but I also don't want to have to
wait until Python 3.4 to have a nice solution.
In which case a stand-alone convenience function could be provided, to
be included in every project facing this issue.
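
One possible shape for such a stand-alone function, as a sketch only. The
names argv_to_wide() and free_wide_argv() are hypothetical. The function
deliberately does not call setlocale() itself, so the embedding application
stays in control of the locale; it also relies on the POSIX behaviour that
mbstowcs() with a NULL destination returns the required length.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: free the first `count` strings and the array. */
static void
free_wide_argv(int count, wchar_t **wargv)
{
    if (wargv == NULL)
        return;
    for (int i = 0; i < count; i++)
        free(wargv[i]);
    free(wargv);
}

/* Hypothetical helper: convert main()'s char **argv to a NULL-terminated
 * wchar_t **argv using the current locale.  Assumes argc >= 0 and that
 * the caller has already set the locale it wants.  Returns NULL on a
 * decoding error or allocation failure. */
static wchar_t **
argv_to_wide(int argc, char **argv)
{
    wchar_t **wargv = malloc(((size_t)argc + 1) * sizeof(wchar_t *));

    if (wargv == NULL)
        return NULL;
    for (int i = 0; i < argc; i++) {
        /* POSIX: a NULL destination makes mbstowcs() return the length. */
        size_t len = mbstowcs(NULL, argv[i], 0);

        if (len == (size_t)-1) {
            free_wide_argv(i, wargv);   /* frees entries 0..i-1 */
            return NULL;
        }
        wargv[i] = malloc((len + 1) * sizeof(wchar_t));
        if (wargv[i] == NULL) {
            free_wide_argv(i, wargv);
            return NULL;
        }
        mbstowcs(wargv[i], argv[i], len + 1);
    }
    wargv[argc] = NULL;
    return wargv;
}
```

An embedder would call setlocale(LC_ALL, "") first if non-ASCII arguments are
expected, then pass the result to Py_SetProgramName(wargv[0]) and
PySys_SetArgv(argc, wargv).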

Regards,
Martin