9. Errors

Many errors will arise while you develop your Python package. At a mature stage of development, the code should be robust: anyone should be able to use it without encountering unexpected errors. The best way to make sure that further development does not silently break existing parts of the package is to write a test suite.

The test suite is a set of tests that should be run automatically to check every functionality of your package every time you update its distribution.

Before going into the testing part, let us recap the different types of errors you will generally encounter.

9.1. Types of errors in Python

In Python, there are several common built-in exceptions that you’ll frequently encounter and might want to test against. Here are some of the main ones:

ZeroDivisionError: Raised when attempting to divide by zero.

[10]:

result = 10 / 0  # Raises ZeroDivisionError

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In [10], line 1
----> 1 result = 10 / 0  # Raises ZeroDivisionError

ZeroDivisionError: division by zero

TypeError: Raised when an operation or function is applied to an object of inappropriate type. For example, trying to add a string to an integer or passing a non-iterable to a function that expects an iterable.

[9]:

result = 'text' + 10  # Raises TypeError

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [9], line 1
----> 1 result = 'text' + 10  # Raises TypeError

TypeError: can only concatenate str (not "int") to str

ValueError: Raised when a function receives an argument of the correct type but inappropriate value. This could happen, for instance, when trying to convert a non-numeric string to an integer.

[8]:

number = int("abc")  # Raises ValueError

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [8], line 1
----> 1 number = int("abc")  # Raises ValueError

ValueError: invalid literal for int() with base 10: 'abc'

IndexError: Raised when an index is out of the range of a list, tuple, or other indexable collections.

[7]:

lst = [1, 2, 3]
print(lst[5])  # Raises IndexError

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In [7], line 2
      1 lst = [1, 2, 3]
----> 2 print(lst[5])  # Raises IndexError

IndexError: list index out of range

KeyError: Raised when trying to access a dictionary with a key that doesn’t exist. This is useful for handling cases where a function requires specific dictionary keys.

[6]:

my_dict = {"a": 1}
print(my_dict["b"])  # Raises KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In [6], line 2
      1 my_dict = {"a": 1}
----> 2 print(my_dict["b"])  # Raises KeyError

KeyError: 'b'

AttributeError: Raised when an invalid attribute is referenced, typically due to accessing an attribute or method that doesn’t exist in an object.

[5]:

class MyClass:
    pass

obj = MyClass()
obj.some_method()  # Raises AttributeError

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In [5], line 5
      2     pass
      4 obj = MyClass()
----> 5 obj.some_method()  # Raises AttributeError

AttributeError: 'MyClass' object has no attribute 'some_method'

FileNotFoundError: Raised when trying to open a file that does not exist. It’s often used in data science to handle cases where file paths are incorrect or files are missing.

[4]:

with open("non_existent_file.txt") as f:
    content = f.read()  # Raises FileNotFoundError

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In [4], line 1
----> 1 with open("non_existent_file.txt") as f:
      2     content = f.read()  # Raises FileNotFoundError

File ~/opt/miniconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py:282, in _modified_open(file, *args, **kwargs)
    275 if file in {0, 1, 2}:
    276     raise ValueError(
    277         f"IPython won't let you open fd={file} by default "
    278         "as it is likely to crash IPython. If you know what you are doing, "
    279         "you can use builtins' open."
    280     )
--> 282 return io_open(file, *args, **kwargs)

FileNotFoundError: [Errno 2] No such file or directory: 'non_existent_file.txt'

OverflowError: Raised when a numerical calculation exceeds the maximum limit for a numeric type. This is common in scientific computations where very large numbers are generated.

[2]:

import math
result = math.exp(1000)  # Raises OverflowError on some systems

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Cell In [2], line 2
      1 import math
----> 2 result = math.exp(1000)  # Raises OverflowError on some systems

OverflowError: math range error

AssertionError: Raised when an assert statement fails. Useful in testing when specific conditions should be met.

[1]:

assert 2 + 2 == 5  # Raises AssertionError

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In [1], line 1
----> 1 assert 2 + 2 == 5  # Raises AssertionError

AssertionError:

RuntimeError: A generic error raised when an error occurs that doesn’t fall into other categories. It’s often used in more complex scenarios where exceptions need custom handling.
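
As a minimal sketch of this last case, the hypothetical function below raises a RuntimeError when an iterative procedure fails to converge, a situation that none of the more specific built-in exceptions describes. The function name, the tolerance and the iteration limit are illustrative assumptions.

```python
def sqrt_newton(x, tol=1e-12, max_iter=50):
    """Approximate the square root of x with Newton's method (illustrative only)."""
    if x < 0:
        # A wrong value of the right type is a ValueError, not a RuntimeError
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = float(x)
    for _ in range(max_iter):
        new_guess = 0.5 * (guess + x / guess)
        # Stop when two successive guesses agree to a relative tolerance
        if abs(new_guess - guess) <= tol * new_guess:
            return new_guess
        guess = new_guess
    # No specific built-in exception describes "did not converge"
    raise RuntimeError(f"no convergence after {max_iter} iterations")


print(sqrt_newton(2.0))
# sqrt_newton(2.0, max_iter=1) would raise
# RuntimeError: no convergence after 1 iterations
```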

9.2. Exception handling

These exceptions allow us to use a very useful Python feature called exception handling.

An example is more useful than words:

[11]:

def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("Error: Cannot divide by zero!")
        return None
    else:
        print("Division successful!")
        return result
    finally:
        print("Execution complete.")

# Example usage
print(divide(10, 2))  # Should print "Division successful!" and the result 5.0
print(divide(10, 0))  # Should print "Error: Cannot divide by zero!" and return None

Division successful!
Execution complete.
5.0
Error: Cannot divide by zero!
Execution complete.
None

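Without exception handling, the second call would crash the program with a ZeroDivisionError. The else block runs only when no exception was raised, and the finally block always runs, which is why "Execution complete." is printed before the returned value in both cases. This feature lets you handle errors gracefully: the program can either continue or exit cleanly, and in both cases tell the user what went wrong.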
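
The same pattern extends to several exception types. The sketch below assumes a hypothetical read_threshold function (not part of the company package): a missing file is treated as recoverable and replaced by a default value, while unreadable content is re-raised with a clearer message.

```python
import os
import tempfile


def read_threshold(path, default=0.5):
    """Read a single float from a text file, falling back to a default."""
    try:
        with open(path) as f:
            return float(f.read())
    except FileNotFoundError:
        # Recoverable: warn the user and continue with the default value
        print(f"Warning: {path} not found, using default {default}")
        return default
    except ValueError as err:
        # Not recoverable here: re-raise with a more helpful message
        raise ValueError(f"{path} does not contain a valid number") from err


# Example usage in a temporary directory, so no real file is touched
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "threshold.txt")
    print(read_threshold(path))  # the file does not exist yet

    with open(path, "w") as f:
        f.write("not a number")
    try:
        read_threshold(path)
    except ValueError as err:
        print(f"Error: {err}")
```

Listing the except clauses from the most specific to the most general keeps each error case explicit, and `raise ... from err` preserves the original traceback for debugging.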

10. Tests

The goal of the test suite is to test every functionality and part of your package.

As soon as you have finished implementing a new part of your code, it is good practice to write a test for it.

The test suite is stored in the tests folder of your package root directory.

10.1. Test suite with pytest

All the files in the tests folder are called test_<name of test>.py where <name of test> should be replaced by the name of the functionality you are testing.

For instance, in our company package, we can create the following test files:

tests/
├── test_base_company.py
├── test_cli.py
└── test_medical.py

The first one tests the base_company.py file (i.e. the Company class and its methods), the second one tests the command-line interface in the cli.py file, and the third one tests the medical.py file (i.e. the MedicalCompany class and its methods).

A test file contains a set of functions that look like this:

def test_medical_init():
    med_company = MedicalCompany(name="MediCorp", specialty="Cardiology", drug_manufacturer=True)
    assert med_company.name == "MediCorp"
    assert med_company.specialty == "Cardiology"
    assert med_company.drug_manufacturer is True

These functions are all based on the assert statement.

The assert statement is used to check if a condition is true. If the condition is false, an AssertionError is raised.
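
A plain assert checks a returned value. To check that a function raises one of the exceptions from section 9.1, pytest provides the pytest.raises context manager. The sketch below uses a hypothetical safe_ratio helper that is not part of the company package.

```python
import pytest


def safe_ratio(numerator, denominator):
    """Hypothetical helper: return numerator / denominator, rejecting zero."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


def test_safe_ratio_value():
    assert safe_ratio(10, 4) == 2.5


def test_safe_ratio_rejects_zero():
    # The test passes only if the block raises ValueError
    # and the error message matches the regular expression in `match`
    with pytest.raises(ValueError, match="non-zero"):
        safe_ratio(10, 0)
```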

Of course, you can be as creative as you want with the tests, and as data scientists, you will want loads of quantitative tests.

For example, consider the stock_price_difference function in the cli.py file.

We can write a test for this function as follows (in tests/test_cli.py):

def test_get_stock_price_difference(capsys, monkeypatch):
    # Mock command-line arguments with a known ticker and date range
    monkeypatch.setattr("sys.argv", [
        "cli.py", "get_stock_price_difference",
        "--ticker", "AAPL",
        "--interval", "1y",
        "--stop_date", "2023-12-31"
    ])

    # Run the CLI main function
    main()

    # Capture output
    captured = capsys.readouterr()

    # Test the numeric value directly by extracting it from the output
    price_diff = float(captured.out.split(": ")[1].strip())
    assert abs(price_diff - 18.717864990234) < 1e-4

Here we know that the value of the stock price difference is 18.717864990234 (at this precision), and we test that the value we get from the function is close enough to this value.

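pytest provides pytest.approx to compare floating-point numbers within a given tolerance. Note that rel=1e-4 is a relative tolerance (about 1.9e-3 for this value), so it is looser than the absolute tolerance of 1e-4 used above; pass abs=1e-4 instead to keep the same check. You could replace the last line above with: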

assert price_diff == pytest.approx(18.717864990234, rel=1e-4)
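
The difference between a relative and an absolute tolerance can be checked directly. In this small sketch the value under test is deliberately shifted by 1e-3, an offset chosen only for illustration.

```python
import pytest

expected = 18.717864990234
value = expected + 1e-3  # deliberately off by 0.001

# rel=1e-4 allows an error of 1e-4 * 18.7..., i.e. about 1.9e-3
assert value == pytest.approx(expected, rel=1e-4)

# abs=1e-4 allows an error of at most 1e-4, so the same value is rejected
assert value != pytest.approx(expected, abs=1e-4)
```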

To test all entries in an array you can also use the following assertion in a test function:

def test_<my_function_name>():

    ...

    expected_values = np.array([0.          , 3663.04149234, 5618.94079371, 6811.03765429, 7625.75439281,
                               8226.01526502, 8691.41376217, 9065.71293446, 9375.23339903, 9636.58188782])

    result = <my_function_name output array>

    np.testing.assert_allclose(result, expected_values, rtol=1e-5)
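
A runnable version of this template might look as follows. The cumulative_returns function and its input values are hypothetical and only serve to illustrate np.testing.assert_allclose.

```python
import numpy as np


def cumulative_returns(prices):
    """Hypothetical function: relative change of each price with respect to the first."""
    prices = np.asarray(prices, dtype=float)
    return prices / prices[0] - 1.0


def test_cumulative_returns():
    result = cumulative_returns([100.0, 110.0, 99.0, 120.0])
    expected_values = np.array([0.0, 0.1, -0.01, 0.2])
    # rtol is relative to expected_values; atol covers the exact zero,
    # for which a purely relative tolerance would allow no error at all
    np.testing.assert_allclose(result, expected_values, rtol=1e-5, atol=1e-12)
```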

10.2. Additional features

pytest has a lot of additional features that you can use to make your life easier.

For instance, you can use the monkeypatch fixture to mock objects or functions, or the capsys fixture to capture the output (i.e., what is stored in stdout) of the print statements of your functions.

We have created an example for this in the test_medical.py file.
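
Independently of that file, a minimal sketch of both fixtures could look like this. The greet_user function and the USER_NAME environment variable are hypothetical; the fixtures themselves are only available when the test is run by pytest, which passes them in by argument name.

```python
import os


def greet_user():
    """Print a greeting for the user named in the USER_NAME environment variable."""
    name = os.environ.get("USER_NAME", "stranger")
    print(f"Hello, {name}!")


def test_greet_user(monkeypatch, capsys):
    # monkeypatch.setenv changes the environment for this test only;
    # pytest restores the original value when the test ends
    monkeypatch.setenv("USER_NAME", "Ada")

    greet_user()

    # capsys.readouterr() returns what was written to stdout and stderr so far
    captured = capsys.readouterr()
    assert captured.out == "Hello, Ada!\n"
```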

10.3. Running the test suite

To run the test suite, go to the root directory of your package and run:

pytest -s tests/*

to run all the tests in the tests folder.

Here the -s option disables pytest's output capturing, so that the output of the print statements is shown on the terminal. Without this option, pytest captures the output and only shows it for tests that fail.

If you want to run the tests of a single file, you can use the following command:

pytest tests/test_<name of test>.py

When all the tests pass, you will see something like this:

================================================== test session starts ==================================================
platform darwin -- Python 3.9.13, pytest-7.2.0, pluggy-1.0.0
rootdir: /Users/boris/MPhil/company_package
plugins: cov-4.1.0, anyio-3.6.2
collecting ... Company package version: 0.0.0b1.dev8+g5c0d18a.d20241030
collected 8 items

tests/test_base_company.py ....
tests/test_cli.py ..
tests/test_medical.py ..

=================================================== 8 passed in 0.85s ===================================================

When a test fails, you will see something like this (here we artificially made a test fail by changing the expected value of the stock price difference):

================================================== test session starts ==================================================
platform darwin -- Python 3.9.13, pytest-7.2.0, pluggy-1.0.0
rootdir: /Users/boris/MPhil/company_package
plugins: cov-4.1.0, anyio-3.6.2
collecting ... Company package version: 0.0.0b1.dev8+g5c0d18a.d20241030
collected 8 items

tests/test_base_company.py ....
tests/test_cli.py .F
tests/test_medical.py ..

======================================================= FAILURES ========================================================
____________________________________________ test_get_stock_price_difference ____________________________________________

capsys = <_pytest.capture.CaptureFixture object at 0x134d294c0>
monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x134d297c0>

    def test_get_stock_price_difference(capsys, monkeypatch):
        # Mock command-line arguments with a known ticker and date range
        monkeypatch.setattr("sys.argv", [
            "cli.py", "get_stock_price_difference",
            "--ticker", "AAPL",
            "--interval", "1y",
            "--stop_date", "2023-12-31"
        ])

        # Run the CLI main function
        main()

        # Capture output
        captured = capsys.readouterr()

        # # Test the numeric value directly by extracting it from the output
        price_diff = float(captured.out.split(": ")[1].strip())
        # assert abs(price_diff - 18.717864990234) < 1e-4


        # Test using pytest.approx for better floating point comparison
>       assert price_diff == pytest.approx(19.717864990234, rel=1e-4)
E       assert 18.717864990234375 == 19.717864990234 ± 2.0e-03
E         comparison failed
E         Obtained: 18.717864990234375
E         Expected: 19.717864990234 ± 2.0e-03

tests/test_cli.py:42: AssertionError
================================================ short test summary info ================================================
FAILED tests/test_cli.py::test_get_stock_price_difference - assert 18.717864990234375 == 19.717864990234 ± 2.0e-03
============================================== 1 failed, 7 passed in 0.89s ==============================================