'Second order import issue in python: What is going on?
I have a rather simple setup:
[FOLDER]
|-> [Lib]
__init__.py (__all__=["modA","modB"])
modA.py (contains class named classA)
modB.py (contains class named classB + from modA import classA)
test1.py (from Lib.modA import classA
from Lib.modB import classB)
|-> [example]
test2.py (import sys
sys.path.append("../")
from Lib.modA import classA
from Lib.modB import classB)
Running test1.py from the Lib folder works perfectly (as in no warnings/errors/etc). Running test2.py from the example folder on the other hand requires the sys-patch due to the weird way python works, and crashes with "No module named modA" tracing back from the from modA import classA(in modB.py) via from Lib.modB import classB(in test2.py).
How is one supposed to define an import in a module such that it will also work irrespective of the possible location of any future script that may use/import said module?
Solution 1:[1]
Python programs should be thought of as packages and modules, not as directories and files. While there is some overlap, packages and modules are more restrictive but also better encapsulated as a result.
Mixing both – say by manually modifying sys.path – should only be done as a last resort.
TLDR:
- Use fully qualified imports:
from Lib.modA import classAinstead offrom modA import classA. - Use the environment for discovery: Add search paths via
PYTHONPATHinstead ofsys.path.
Start by deciding which ones are the top-level packages.
This is the point where we go from "directories/files" to "package". Notably, we cannot go "above" the top-level later on, so it should contain everything we need. However, we cannot remove anything "below" either, so it should be a tight enough selection.
In the example we could go as low as treating modA and siblings as their own module-package, and as high as FOLDER encompassing the entire project.
[FOLDER]
|-> [Lib]
| |-> modA.py
: :
It is reasonable to pick Lib since it represents a self-contained part. Going higher to FOLDER would be excessive, going lower to modA and siblings would break their relation as belonging together.
Everything under the top-level package folder now belongs to the package.
Prominently, modA.py is now the module Lib.modA. It is not FOLDER.Lib.modA, nor just modA.
Use only absolute fully qualified names or relative names for import.
Now that the top-level is established, all import statements must be made with regard to it. Refer to modules by their fully qualified name only, even when two modules share a more common parent:
# Lib.modB
# okay, fully qualified import
from Lib.modA import classA
# broken, unqualified import
from modA import classA
In order to avoid typing out the entire fully qualified name, one can use a relative name instead. This reuses the current module name to derive the fully qualified name of the module to import.
# Lib.modB
# okay, relative import
from .modA import classA
Note: Relative imports are a package operation, not a filesystem operation. One cannot go beyond the top-level, but descend into package namespaces.
Everything inside the module is now encapsulated and self-contained.
All internal imports will work irrespective of the location of the package itself. As long as the top-level can be imported, everything below it can be imported as well.
Notably, this works irrespective of the location of any future script that may use/import the package.
Enable packages via the environment, not the program.
The point of defining a package is getting a single, self-contained entity that represents a library. One could zip up a package or similar, and it would still be a package.
Instead of having scripts assume the location of the package, the environment - of the script, the user, or the entire machine – should expose the package. There are basically two ways to do this:
- Announce the package location as a search location. This is suitable for development as it is flexible but hard to maintain. The
PYTHONPATHis appropriate for this. - Move the package to a search location used by Python. This is suitable for production and distribution as it requires effort but is easy to maintain. Packaging and using a package manager is appropriate for this.
Solution 2:[2]
To what I understand, when you are trying to run manually there is no path set as the environment, and when you just do
python file.py
python doesn't know in which environment we are running, which is precisely why, we do sys.path.append, to add the sibling directory path to env.
In the case if you are using an IDE like spyder or pycharm. you have an option to set the directory in which your program is located, or in pycharm it creates the whole program as a package where all the paths are handled by IDE.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | MisterMiyagi |
| Solution 2 | vjspranav |
