8  Importing Code

Every Python installation includes a number of modules by default. Collectively they are called the “Python Standard Library”, and they are often loosely referred to as “built-in” modules.

Even though these modules are available on our computer, we cannot use them directly. We first need to import them into our current namespace.

For example, if we try to use the module pathlib without importing it, we get a NameError:

pathlib
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 pathlib

NameError: name 'pathlib' is not defined

That means that the word pathlib is unknown in this context (namespace).
Let’s bring it in:

import pathlib

Now we can use it:

pathlib.Path(".")
PosixPath('.')

We can also import only specific names from a module, such as the Path class:

from pathlib import Path

It is also possible to alias these imported names using the keyword as:

import pathlib as pl
pl.Path(".")
PosixPath('.')
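
The import forms above all bind names to the same underlying objects; they differ only in which name ends up in our namespace. A minimal sketch of this, using only the pathlib module already shown (the short alias P is just an illustrative choice):

```python
import pathlib
import pathlib as pl
from pathlib import Path

# All three spellings lead to the very same class object.
same_class = pathlib.Path is pl.Path is Path
print(same_class)  # True

# `as` can also be combined with `from ... import ...`.
from pathlib import Path as P

print(P(".") == Path("."))  # True
```

Which form to use is mostly a matter of readability: `import pathlib` keeps the origin of each name visible, while `from pathlib import Path` is shorter when one name is used often.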

8.1 Importable Code vs. Scripting

We can also import our own code. For example, if we have a Python file called mymodule.py, we can import it from another file (for instance, one in the same directory):

# anotherfile.py

import mymodule

Sometimes a Python module (any .py file) contains both code that we only want to run when the file is executed as a script and code that we want to import and reuse somewhere else.

We can isolate the part of the code that should only run as a script with a special syntax: the if __name__ == "__main__": block. Let’s do some exercises to show it.
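
As a preview of that syntax, here is a minimal sketch. The file name greeting.py and the function greet are hypothetical, chosen only for illustration. Python sets the variable `__name__` to "__main__" when a file is run directly, and to the module's name when the file is imported.

```python
# greeting.py (hypothetical file name, for illustration only)


def greet(name):
    """Return a greeting. This function is meant to be imported elsewhere."""
    return f"Hello, {name}!"


# __name__ equals "__main__" only when this file is run directly.
# When another file does `import greeting`, __name__ is "greeting"
# and the block below is skipped.
if __name__ == "__main__":
    print(greet("world"))
```

Running the file directly prints the greeting, while importing it from another file only makes greet available without printing anything.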

8.2 Exercises

Create two files like these in our working project: /pycourse/src/pycourse/data.py and /pycourse/src/pycourse/preprocessing.py.

# data.py

RAWDATA = ...  # 3 dots are called "Ellipsis" and act as placeholder

def fetch_data(raw_data):
    ... 

# This block only runs when the file is executed as a script
# (e.g. `python data.py`), not when it is imported
if __name__ == "__main__":
    print("Fetching data")
    data = fetch_data(RAWDATA)
    print(data)

# preprocessing.py


# TODO: add missing imports to make it work

def cleanup(data):
    ... 
    
def preprocess_pipeline(raw_data):
    data = fetch_data(raw_data)
    return cleanup(data)
     
if __name__ == "__main__":
    print("Running preprocessing")    
    raw_data = ...
    preprocess_pipeline(raw_data)

  1. Copy the RAWDATA from the previous lab exercises into the file data.py.
  2. Create another file called preprocessing.py in the same directory.
  3. Import the RAWDATA string from data.py inside preprocessing.py, together with any other names the TODO comment calls for.
  4. Run data.py as a script with uv run src/pycourse/data.py. Pay attention to the printed output.
  5. Run preprocessing.py as a script with uv run src/pycourse/preprocessing.py. Pay attention to the printed output. What difference do you see compared to the previous step?

Congratulations! You just ran your first Python scripts 🚀