Don't expect os.environ.pop to do anything meaningful

It removes the key only from Python's local copy of the environment, not from the process environment itself, so child processes still see the variable. The Python developers seem simply to have left pop out of the environment mapping's implementation in os.py. Gustavo Niemeyer noticed this one.

>>> import os
>>> os.system("echo $ASD")

0
>>> os.environ["ASD"] = "asd"
>>> os.system("echo $ASD")
asd
0
>>> os.environ.pop("ASD")
'asd'
>>> os.system("echo $ASD")
asd
0
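
A possible workaround, assuming a platform where Python can call unsetenv(): delete the key with del rather than pop(). In os.py the mapping's __delitem__ calls unsetenv() where it is available, so the deletion should reach the real environment. The sketch below checks this by asking a fresh child process. child_has is a hypothetical helper written for this example, not part of the os module.

```python
import os
import subprocess
import sys


def child_has(name):
    """Return True if a freshly spawned child process sees `name` in its environment."""
    # The child is a new interpreter, so its os.environ reflects what it really inherited.
    code = "import os; print(%r in os.environ)" % name
    out = subprocess.check_output([sys.executable, "-c", code])
    return out.strip() == b"True"


os.environ["ASD"] = "asd"          # assignment is propagated via putenv()
seen_before = child_has("ASD")

del os.environ["ASD"]              # __delitem__ instead of pop()
seen_after = child_has("ASD")

print(seen_before, seen_after)
```

If seen_after comes back True on your interpreter, the deletion did not reach the process environment there and you would need another way to unset the variable.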

5 comments:

Graham Dumpleton said...

It gets worse than that. One of the problems with os.environ is that it is a copy of what was the set of C API environment variables at the time the interpreter was created. In an embedded system where distinct C code may have used C API putenv() to set an environment variable, this will never be visible to Python code if it is done after the Python interpreter was created. In this situation, you are forced to create a special extension module just to get access to the C API getenv() to see the environment variable set by the distinct C code.

Another problem is where in an embedded system multiple Python sub interpreters are created, but not all at the same time. The issue here is that when you update from Python code values in os.environ, it will automatically call C API putenv() to propagate the value to the set of C environment variables. As per above, this will not be visible to other existing Python sub interpreters in the same process. This is generally fine as the whole point of using multiple Python sub interpreters in a process is to maintain some separation and not have the code in each interfere with each other.

If a further Python sub interpreter is created later, though, it will inherit the newly set environment variable, so updates to os.environ effectively leak from one Python sub interpreter to another over time.

Most people wouldn't be doing such complicated stuff themselves, but it does become a problem when using mod_python and mod_wsgi, as both use multiple Python sub interpreters. This can especially be a problem where web frameworks base their configuration on environment variables, as distinct instances of the web application running in separate Python sub interpreters can interfere with each other.

In short, in web applications at least, use of environment variables is not a good idea if you want to host it in conjunction with Apache using packages like mod_python or mod_wsgi.

Christopher Armstrong said...

Thanks for the comment. FYI, you're not required to create your own special extension module to get access to getenv, because there's already os.getenv.


I've never heard of anyone instantiating multiple Python interpreters in one process. I thought CPython had way too much global state for that. Is that actually possible? I was under the impression that mod_python and mod_wsgi just use regular threads or processes.

Graham Dumpleton said...

os.getenv() only returns values from os.environ, i.e. the copy of the C environment variables. It does not call the actual C getenv(). You can tell because, as you point out, deleting a value from os.environ doesn't delete it from the C environment. If os.getenv() really called C getenv(), you would still get the value back. Instead it returns None.

If you look at os.py you will see how it supplies getenv(), mapping it to os.environ.get() instead.
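
One way to see this without writing an extension module is to reach the C functions through ctypes. This is only a sketch, and it assumes a POSIX system where ctypes.CDLL(None) exposes the C library's setenv() and getenv(). The variable name ASD_SET_FROM_C is made up for the example.

```python
import ctypes
import os

# Assumption: POSIX platform. CDLL(None) gives access to symbols already
# loaded into the process, which includes libc's setenv() and getenv().
libc = ctypes.CDLL(None)
libc.setenv.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int]
libc.setenv.restype = ctypes.c_int
libc.getenv.argtypes = [ctypes.c_char_p]
libc.getenv.restype = ctypes.c_char_p

NAME = "ASD_SET_FROM_C"  # hypothetical variable name

# Set the variable from the C side, after the interpreter has already
# taken its copy of the environment.
libc.setenv(NAME.encode(), b"asd", 1)

from_environ = os.environ.get(NAME)         # reads Python's copy
from_os_getenv = os.getenv(NAME)            # also reads Python's copy
from_c_getenv = libc.getenv(NAME.encode())  # reads the real C environment

print(from_environ, from_os_getenv, from_c_getenv)
```

Only the direct C getenv() call should see the value. The two Python lookups read the copy taken when the os module was imported.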

Both mod_python and mod_wsgi use a combination of processes, threads and multiple interpreters.

Morrighu Tel Uvrith said...

Hi,

We’re an open source gaming engine project that’s looking for python devs. If you or anyone you know is interested, please visit us at www.projectangela.org.

TIA,

M.

pomke said...

Hi Morrighu Tel Uvrith, I would not work for your company out of principle, even if you paid me in tim-tams vendored by nubile Japanese princesses.