{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Package: Initialization and Structure\n", "\n", "\n", "Building your own Python package is easy. \n", "\n", "You need to:\n", "\n", "- make a folder where you will store your package\n", "- create the relevant configuration files\n", "- write the code you want your package to contain\n", "- `pip` install it\n", "\n", "We go over these steps here, and then you should **practice**.\n", "\n", "\n", "## What is a Python package?\n", "\n", "\n", "A Python package is a collection of functions and classes that serve a specific purpose.\n", "\n", "The code is usually split across several files, each holding a subset of functions that logically belong together. \n", "Each such file is called a *module*. \n", "\n", "An example of a well-maintained package is [getdist](https://github.com/cmbant/getdist).\n", "\n", "
\n", "**Exercise:** Click on [getdist](https://github.com/cmbant/getdist) and look inside the repository. \n", "Identify the configuration files and modules. \n", "
\n", "\n", "\n", "## The folder structure of a Python package\n", "\n", "\n", "### Minimal package structure\n", "\n", "The structure is simple. A minimal package file structure would look like this:\n", "\n", "```bash\n", ".\n", "├── pyproject.toml      # Configuration\n", "├── README.md           # Instructions\n", "├── package_name/       # Package folder with the code\n", "│   ├── __init__.py\n", "│   ├── module1.py\n", "│   ├── module2.py\n", "│   └── module3.py\n", "├── dist/               # Distribution files\n", "├── docs/               # Documentation files\n", "└── tests/              # Test files\n", "```\n", "\n", "We will cover `dist`, `tests` and `docs` later. You can ignore them for now and focus on the rest.\n", "\n", "
\n", "**Exercise:** Create a folder called `my_package` and create the minimal file structure above with a simple `module1.py` file.\n", "The `module1.py` file should contain a simple function `print_name` that prints \"Hello, ``!\".\n", "\n", "So, you should be able to run:\n", "\n", "```bash\n", "pip install -e .\n", "```\n", "from inside the package folder to install it.\n", "\n", "Then in a python session you should be able to run:\n", "\n", "```python\n", "import package_name as mypkg\n", "mypkg.print_name()\n", "```\n", "\n", "and it should print \"Hello, ``!\".\n", "\n", "**Tip:** Use Google and ChatGPT to help you.\n", "\n", "
\n", "\n", "\n", "### Full package structure\n", "\n", "As serious package developers, you need to be organized. Here is the full package structure you should follow from now on:\n", "\n", "```bash\n", "my_package/\n", "├── .gitignore                      # Git ignore file for unnecessary files\n", "├── .readthedocs.yaml               # Configuration for Read the Docs\n", "├── README.md                       # Project overview and instructions\n", "├── pyproject.toml                  # Project configuration, dependencies, and build settings\n", "├── package_name/                   # Source code directory\n", "│   ├── __init__.py                 # Init file\n", "│   ├── base.py                     # Base classes and functions\n", "│   ├── version.py                  # Dynamic version handling for the package\n", "│   ├── sub_module1/                # Sub module 1\n", "│   │   ├── __init__.py             # Init file\n", "│   │   ├── module1.py              # Sub module 1 main module\n", "│   │   └── module1_functions.py    # Sub module 1 functions\n", "│   └── sub_module2/                # Sub module 2\n", "│       ├── __init__.py             # Init file\n", "│       └── module2.py              # Sub module 2 main module\n", "├── dist/                           # Distribution files\n", "├── docs/                           # Sphinx documentation directory\n", "└── tests/                          # Test suite directory\n", "```\n", "\n", "Note the two hidden files:\n", "\n", "- `.gitignore` to tell `git` which files to ignore\n", "- `.readthedocs.yaml` to tell [Read the Docs](https://readthedocs.org/) how to build the documentation\n", "\n", "Importantly, submodules have their own `__init__.py` file and are stored in their own folder.\n", "\n", "
\n", "**Exercise:** Create an account on [Read the Docs](https://readthedocs.org/). You will need it.\n", "
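\n",
"\n",
"To make the role of the `__init__.py` files concrete, here is a minimal sketch that builds a throwaway package with one submodule folder in a temporary directory and imports from it. The names `demo_pkg`, `base_value` and `doubled` are made up for illustration, and adding the folder to `sys.path` is only a shortcut for this demo, not how an installed package is found:\n",
"\n",
"```python\n",
"import importlib\n",
"import sys\n",
"import tempfile\n",
"from pathlib import Path\n",
"\n",
"# Throwaway layout (hypothetical names), one __init__.py per folder\n",
"root = Path(tempfile.mkdtemp())\n",
"(root / 'demo_pkg' / 'sub_module1').mkdir(parents=True)\n",
"\n",
"files = {\n",
"    'demo_pkg/__init__.py': 'from .base import base_value',\n",
"    'demo_pkg/base.py': 'base_value = 1',\n",
"    'demo_pkg/sub_module1/__init__.py': 'from .module1 import doubled',\n",
"    'demo_pkg/sub_module1/module1.py': 'from ..base import base_value; doubled = 2 * base_value',\n",
"}\n",
"for relative_path, source in files.items():\n",
"    (root / relative_path).write_text(source)\n",
"\n",
"# Make the temporary folder importable for this demo only\n",
"sys.path.insert(0, str(root))\n",
"importlib.invalidate_caches()\n",
"\n",
"import demo_pkg\n",
"from demo_pkg.sub_module1 import doubled\n",
"\n",
"print(demo_pkg.base_value)  # name re-exported by demo_pkg/__init__.py\n",
"print(doubled)              # name re-exported by sub_module1/__init__.py\n",
"```\n",
"\n",
"Each `__init__.py` decides which names are visible when the corresponding folder is imported.\n",
"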
\n", "\n", "\n", "\n", "\n", "## Working example\n", "\n", "We now go through a working example of a package that deals with companies.\n", "It shows you the crucial steps of building a package following good practices. \n", "\n", "The package is available [here](https://github.com/borisbolliet/company_package) on GitHub.\n", "\n", "\n", "### README.md\n", "\n", "`README.md` is a markdown file that tells users what the package is about and how to install and use it.\n", " \n", "\n", "```plaintext\n", "# Company Package\n", "\n", "\n", "\n", "## Features\n", "\n", "\n", "\n", "## Installation\n", "\n", "\n", "\n", "## Usage\n", "\n", "\n", "\n", "## Documentation\n", "\n", "Link to the [documentation page](https://your-readthedocs-url-here).\n", "\n", "## Contributing\n", "\n", "Contributions via pull requests are welcome! \n", "\n", "## License\n", "\n", "\n", "```\n", "\n", "### Core module\n", "\n", "The core module contains the base class and functions. \n", "\n", "In our example, it is:\n", "\n", "```bash\n", "company_package/\n", "└── company/\n", "    ├── __init__.py        # Init file for the 'company' package\n", "    ├── base_company.py    # Base module containing the main `Company` class\n", "    └── version.py         # Dynamic version handling for the package\n", "```\n", "\n", "Look at the files [here](https://github.com/borisbolliet/company_package).\n", "\n", "Note that the files `__init__.py` and `version.py` are required here, and \n", "their names must match exactly: Python looks for `__init__.py`, and `pyproject.toml` refers to `version.py`. \n", "\n", "`__init__.py` is what turns your code into a package. \n", "\n", "`version.py` sets up the version number. 
\n", "\n", "\n", "Most importantly, the `base_company.py` file contains the core code.\n", "\n", "It usually starts with some imports of external packages (the `dependencies` in the `pyproject.toml` file) that are needed for the package to work.\n", "\n", "```python\n", "import yfinance as yf\n", "import pandas as pd\n", "```\n", "\n", "Only import what you need in each file (i.e., if a package is only used in one file, only import it in that file).\n", "\n", "### pyproject.toml\n", "\n", "`pyproject.toml` is the configuration file for the package. \n", "\n", "Here is what it should look like:\n", "\n", "```toml\n", "[build-system]\n", "requires = [\"setuptools\", \"wheel\", \"setuptools_scm\"] # Build requirements\n", "build-backend = \"setuptools.build_meta\"\n", "\n", "[project]\n", "name = \"company\" # name of the package must match the core folder name\n", "dynamic = [\"version\"]\n", "description = \"A Python package for modeling companies across various sectors.\"\n", "readme = \"README.md\"\n", "requires-python = \">=3.9\"\n", "license = { file = \"LICENSE\" }\n", "authors = [\n", " { name = \"Your Name\", email = \"your.email@example.com\" },\n", " { name = \"Boris\", email = \"boris.bolliet@gmail.com\" }\n", "]\n", "keywords = [\"companies\", \"finance\", \"healthcare\", \"technology\"]\n", "classifiers = [\n", " \"Development Status :: 4 - Beta\",\n", " \"Intended Audience :: Information Technology\",\n", " \"License :: OSI Approved :: MIT License\",\n", " \"Programming Language :: Python :: 3\",\n", " \"Topic :: Software Development :: Libraries\"\n", "]\n", "\n", "# Runtime dependencies\n", "dependencies = [\n", " \"numpy\",\n", " \"pandas\",\n", " \"yfinance\",\n", "]\n", "\n", "[project.urls]\n", "\"Documentation\" = \"https://your-readthedocs-url-here\"\n", "\"Source\" = \"https://github.com/yourusername/companies_package\"\n", "\"Issues\" = \"https://github.com/yourusername/companies_package/issues\"\n", "\n", "\n", "[tool.setuptools_scm]\n", 
"write_to = \"company/version.py\" # Where to write the dynamic version\n", "\n", "[tool.setuptools.packages.find]\n", "where = [\".\"]\n", "```\n", "\n", "For further details, see [here](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#a-full-example).\n", "\n", "\n", "The list of approved classifiers is available [here](https://pypi.org/classifiers/). It tells you what to put in the `classifiers` field of the `pyproject.toml` file.\n", "\n", "### Package initialisation\n", "\n", "To setup the dynamic versioning and development workflow.\n", "\n", "From inside the package folder, in a terminal, we start with:\n", "\n", "```bash\n", "git init\n", "git add .\n", "git commit -m \"Initial commit\"\n", "git tag 0.0.0beta0\n", "```\n", "\n", "This step is very important for the versioning to work with `setuptools_scm`.\n", "\n", "We can now install the package in development mode with:\n", "\n", "```bash\n", "pip install -e .\n", "```\n", "\n", "We can now import the package in a python session, or notebook:\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import company as cp" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Company',\n", " 'MedicalCompany',\n", " '__builtins__',\n", " '__cached__',\n", " '__doc__',\n", " '__file__',\n", " '__loader__',\n", " '__name__',\n", " '__package__',\n", " '__path__',\n", " '__spec__',\n", " '__version__',\n", " 'base_company',\n", " 'medical',\n", " 'version']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(cp)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.0.post3'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.__version__" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"['/Users/boris/Desktop/company_package/company']" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.__path__\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then try the package out, testing some of the \n", "methods of the base class. \n", "\n", "First, we create a company instance:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "my_company = cp.Company(name=\"Nvidia\", ticker=\"NVDA\")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['__class__',\n", " '__delattr__',\n", " '__dict__',\n", " '__dir__',\n", " '__doc__',\n", " '__eq__',\n", " '__format__',\n", " '__ge__',\n", " '__getattribute__',\n", " '__getstate__',\n", " '__gt__',\n", " '__hash__',\n", " '__init__',\n", " '__init_subclass__',\n", " '__le__',\n", " '__lt__',\n", " '__module__',\n", " '__ne__',\n", " '__new__',\n", " '__reduce__',\n", " '__reduce_ex__',\n", " '__repr__',\n", " '__setattr__',\n", " '__sizeof__',\n", " '__str__',\n", " '__subclasshook__',\n", " '__weakref__',\n", " 'display_info',\n", " 'get_stock_info',\n", " 'get_yfinance_status',\n", " 'name',\n", " 'summarize_activity',\n", " 'ticker']" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(my_company)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Nvidia'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_company.name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we can test the display method:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company Name: Nvidia\n", "Ticker Symbol is: NVDA\n" ] } ], "source": [ "my_company.display_info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"Or the availability of the stock data:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Available on yfinance'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_company.get_yfinance_status()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And if it is available, we can get its stock history:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th>Date</th><th>Open</th><th>High</th><th>Low</th><th>Close</th><th>Volume</th><th>Dividends</th><th>Stock Splits</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>2025-09-23 00:00:00-04:00</th><td>181.970001</td><td>182.419998</td><td>176.210007</td><td>178.429993</td><td>192559600</td><td>0.0</td><td>0.0</td></tr>\n", "    <tr><th>2025-09-24 00:00:00-04:00</th><td>179.770004</td><td>179.779999</td><td>175.399994</td><td>176.970001</td><td>143564100</td><td>0.0</td><td>0.0</td></tr>\n", "    <tr><th>2025-09-25 00:00:00-04:00</th><td>174.479996</td><td>180.259995</td><td>173.130005</td><td>177.690002</td><td>191586700</td><td>0.0</td><td>0.0</td></tr>\n", "    <tr><th>2025-09-26 00:00:00-04:00</th><td>178.169998</td><td>179.770004</td><td>174.929993</td><td>178.190002</td><td>148573700</td><td>0.0</td><td>0.0</td></tr>\n", "    <tr><th>2025-09-29 00:00:00-04:00</th><td>180.429993</td><td>184.000000</td><td>180.320007</td><td>181.850006</td><td>193063500</td><td>0.0</td><td>0.0</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " Open High Low Close \\\n", "Date \n", "2025-09-23 00:00:00-04:00 181.970001 182.419998 176.210007 178.429993 \n", "2025-09-24 00:00:00-04:00 179.770004 179.779999 175.399994 176.970001 \n", "2025-09-25 00:00:00-04:00 174.479996 180.259995 173.130005 177.690002 \n", "2025-09-26 00:00:00-04:00 178.169998 179.770004 174.929993 178.190002 \n", "2025-09-29 00:00:00-04:00 180.429993 184.000000 180.320007 181.850006 \n", "\n", " Volume Dividends Stock Splits \n", "Date \n", "2025-09-23 00:00:00-04:00 192559600 0.0 0.0 \n", "2025-09-24 00:00:00-04:00 143564100 0.0 0.0 \n", "2025-09-25 00:00:00-04:00 191586700 0.0 0.0 \n", "2025-09-26 00:00:00-04:00 148573700 0.0 0.0 \n", "2025-09-29 00:00:00-04:00 193063500 0.0 0.0 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stock_history = my_company.get_stock_info(period=\"1mo\")\n", "stock_history.head()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Package development\n", "\n", "With this good starting point, we can now start developing the package. 
\n", "\n", "\n", "Let us do an example and implement a new class `MedicalCompany` that inherits from the base `Company` class.\n", "\n", "We create a submodule `medical`, as a folder `medical`, with an `__init__.py` file and a `medical.py` file.\n", "\n", "So, our tree structure now looks like this:\n", "\n", "```bash\n", "company_package/\n", "├── README.md\n", "├── company\n", "│   ├── __init__.py\n", "│   ├── base_company.py\n", "│   ├── version.py\n", "│   ├── medical\n", "│   │   ├── __init__.py\n", "│   │   └── medical.py\n", "```\n", "\n", "#### Relative imports\n", "\n", "\n", "The core code of our package defines the base class in the `base_company.py` file.\n", "\n", "In the submodule, we define derived classes that inherit from the base class.\n", "\n", "First, at the top of the `medical.py` file, we import the base class, and everything else we need:\n", "\n", "```python\n", "import pandas as pd\n", "from ..base_company import Company\n", "```\n", "\n", "Here, with `..` we go up one level in the folder structure and ask to import the \n", "`Company` class that is defined in the `base_company.py` file. \n", "This is an example of a **relative import**. \n", "\n", "We also import `pandas`, which is used in this submodule.\n", "\n", "\n", "#### Class constructor\n", "\n", "\n", "Python classes generally have an initialisation method `__init__` that sets up the **object** (also called **instance**).\n", "\n", "This method is called the **initializer** or **constructor**.\n", "\n", "It is a **method**, i.e., a function defined inside a class that receives the instance as its first parameter. 
\n", "\n", "\n", "\n", "For instance, in our `base_company.py` file, we defined the `Company` class constructor as follows:\n", "\n", "```python\n", "class Company:\n", "    def __init__(self, name, ticker=None):\n", "        \"\"\"\n", "        Initialize a Company instance.\n", "\n", "        Parameters:\n", "        - name (str): Name of the company.\n", "        - ticker (str): Stock ticker symbol if the company is publicly traded.\n", "        \"\"\"\n", "        self.name = name\n", "        self.ticker = ticker\n", "```\n", "\n", "\n", "#### Parameter passing\n", "\n", "In the [example above](#Class-constructor), the `self` parameter is a reference to the instance of the class. It must be the first parameter of every instance method. \n", "\n", "The `name` and `ticker` parameters are used to set the **attributes** of the instance, as can be seen in the constructor body.\n", "\n", "Since `ticker` is presented with an equal sign and a default value, it is called an **optional** parameter.\n", "\n", "However, `name` is not presented in such a way, so it is a **required** parameter.\n", "\n", "\n", "It is common to use two additional parameters: `*args` and `**kwargs`.\n", "\n", "- `*args` is used to pass a variable number of positional arguments to a function or method.\n", "- `**kwargs` is used to pass a variable number of keyword arguments to a function or method.\n", "\n", "What does this mean?\n", "\n", "Let us look at an example to see how this works and why it can be useful.\n", "\n", "\n", "We add a method to the `Company` class that takes `*args` and `**kwargs` as arguments, whose purpose\n", "is to summarize activities of the company based on info provided in `*args` and `**kwargs`.\n", "\n", "\n", "```python\n", "class Company:\n", "\n", "    ...\n", "\n", "    def summarize_activity(self, *args, **kwargs):\n", "        \"\"\"\n", "        Summarizes company activities and additional information.\n", "\n", "        Parameters:\n", "        - *args: A list of activities related to the company.\n", "        - **kwargs: Additional information, like location or date.\n", "        \"\"\"\n", "        print(f\"\\nActivity Summary for {self.name}:\")\n", "\n", "        if args:\n", "            print(\"Activities:\")\n", "            for activity in args:\n", "                print(f\" - {activity}\")\n", "\n", "        if kwargs:\n", "            print(\"Additional Information:\")\n", "            for key, value in kwargs.items():\n", "                print(f\" - {key.capitalize()}: {value}\")\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us see it at work:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company package version: 0.0.post3\n", "\n", "Activity Summary for PharmaCorp:\n", "Activities:\n", " - Researching new drugs\n", " - Launching a public health campaign\n" ] } ], "source": [ "# Import the Company class from the company package\n", "import company as cp\n", "\n", "# Creating a Company instance\n", "company = cp.Company(name=\"PharmaCorp\")\n", "\n", "# Example 1: Using *args to pass activities\n", "company.summarize_activity(\"Researching new drugs\", \"Launching a public health campaign\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we pass two activities as positional string arguments.\n", "\n", "They are collected in the tuple `args` and printed in the method body. \n", "\n", "\n", "Note that no code outside the method knows about them.\n", "In this sense, these are **local** variables.\n", "\n", "\n", "Let us do a second example with both `*args` and `**kwargs`." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Activity Summary for PharmaCorp:\n", "Activities:\n", " - Researching new drugs\n", " - Launching a public health campaign\n", "Additional Information:\n", " - Location: New York\n", " - Date: 2024-10-27\n" ] } ], "source": [ "# Example 2: Using both *args and **kwargs to provide activities and additional information\n", "company.summarize_activity(\n", " \"Researching new drugs\", \"Launching a public health campaign\",\n", " location=\"New York\", date=\"2024-10-27\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we pass two additional pieces of information as keyword arguments, via `**kwargs`.\n", "\n", "These are stored in `kwargs` as a dictionary and printed in the method body. They are also local variables." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If I wanted to access these variables outside the method, I would need to store them as attributes of the instance. 
\n", "\n", "For instance, this can be done by adding:\n", "\n", "```python\n", "        # Initialize activities if it hasn't been set yet\n", "        if not hasattr(self, 'activities'):\n", "            self.activities = []\n", "\n", "        # Store activities in the instance\n", "        self.activities.extend(args)\n", "\n", "        # Set each key-value pair in kwargs as an attribute\n", "        for key, value in kwargs.items():\n", "            setattr(self, key, value)  # Dynamically create an attribute\n", "```\n", "\n", "to the `summarize_activity` method.\n", "\n", "Of course, these attributes are only set for this specific instance of the `Company` class, and only \n", "after the method has been called.\n", "\n", "\n", "We can now do:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Activity Summary for PharmaCorp:\n", "Activities:\n", " - Researching new drugs\n", " - Launching a public health campaign\n", "Additional Information:\n", " - Location: New York\n", " - Date: 2024-10-27\n", "\n", "Dynamically set attributes:\n", "Activities: ['Researching new drugs', 'Launching a public health campaign']\n", "Location: New York\n", "Date: 2024-10-27\n" ] } ], "source": [ "# Import the Company class from the company package\n", "import company as cp\n", "\n", "# Creating a Company instance\n", "company = cp.Company(name=\"PharmaCorp\")\n", "\n", "# Using both *args (activities) and **kwargs (additional information)\n", "company.summarize_activity(\"Researching new drugs\", \"Launching a public health campaign\",\n", "                           location=\"New York\", date=\"2024-10-27\")\n", "\n", "# Accessing the dynamically set attributes to check that they are stored\n", "print(\"\\nDynamically set attributes:\")\n", "print(\"Activities:\", company.activities) # Output: list of activities stored in the instance\n", "print(\"Location:\", company.location) # Output: New York\n", "print(\"Date:\", company.date) # Output: 2024-10-27" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "Switching order of the arguments:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "ename": "SyntaxError", "evalue": "positional argument follows keyword argument (2611611388.py, line 9)", "output_type": "error", "traceback": [ " \u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[4]\u001b[39m\u001b[32m, line 9\u001b[39m\n\u001b[31m \u001b[39m\u001b[31m\"Researching new drugs\", \"Launching a public health campaign\")\u001b[39m\n ^\n\u001b[31mSyntaxError\u001b[39m\u001b[31m:\u001b[39m positional argument follows keyword argument\n" ] } ], "source": [ "# Import the Company class from the company package\n", "import company as cp\n", "\n", "# Creating a Company instance\n", "company = cp.Company(name=\"PharmaCorp\")\n", "\n", "# Example 1: Using *args to pass activities\n", "company.summarize_activity(location=\"New York\", date=\"2024-10-27\",\n", " \"Researching new drugs\", \"Launching a public health campaign\")\n", "\n", "# Accessing dynamically set attributes to understand these are stored\n", "print(\"\\nDynamically set attributes:\")\n", "print(\"Activities:\", company.activities) # Output: list of activities stored in the instance\n", "print(\"Location:\", company.location) # Output: New York\n", "print(\"Date:\", company.date) # Output: 2024-10-27" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It does not work. 
The keyword arguments must come after the positional arguments.\n", "\n", "\n", "Now providing only keyword arguments:\n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Activity Summary for PharmaCorp:\n", "Additional Information:\n", " - Location: New York\n", " - Date: 2024-10-27\n", "\n", "Dynamically set attributes:\n", "Activities: []\n", "Location: New York\n", "Date: 2024-10-27\n" ] } ], "source": [ "# Import the Company class from the company package\n", "import company as cp\n", "\n", "# Creating a Company instance\n", "company = cp.Company(name=\"PharmaCorp\")\n", "\n", "# Using only **kwargs (no positional arguments)\n", "company.summarize_activity(location=\"New York\", date=\"2024-10-27\")\n", "\n", "# Accessing the dynamically set attributes to check that they are stored\n", "print(\"\\nDynamically set attributes:\")\n", "print(\"Activities:\", company.activities) # Output: list of activities stored in the instance\n", "print(\"Location:\", company.location) # Output: New York\n", "print(\"Date:\", company.date) # Output: 2024-10-27" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It works: `args` is empty, so `activities` is an empty list." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "#### Class inheritance\n", "\n", "Still in the `medical.py` file, we define the `MedicalCompany` **child** (or **derived**) class that inherits from the **parent** (or **base**) `Company` class:\n", "\n", "```python\n", "class MedicalCompany(Company):\n", "    ...\n", "```\n", "\n", "If you want the child class to be exactly the same as the base class, you can use `pass`:\n", "\n", "```python\n", "class MedicalCompany(Company):\n", "    pass\n", "```\n", "\n", "With this, all methods of the base class are inherited by the child class.\n", "\n", "
\n", "**Exercise:** Create the `MedicalCompany` class which inherits from the `Company` class and does nothing else.\n", "
\n", "\n", "For example, we get the following behaviour:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "import company as cp\n", "\n", "med_comp = cp.MedicalCompany(name=\"HealthCare Inc.\", ticker=\"HCI\")\n", "med_comp.display_info()\n", "\n", "# This prints the same output as the base class would.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In general, the child class has additional methods and attributes.\n", "\n", "In this case, we don't use `pass`. Instead, we override the methods of the base class,\n", "use `super()` to call the methods of the base class and add new attributes and methods.\n", "\n", "\n", "For example, \n", "\n", "```python\n", "class MedicalCompany(Company):\n", "    def __init__(self, name, specialty, drug_manufacturer=False, ticker=None):\n", "        super().__init__(name, ticker)\n", "        self.specialty = specialty\n", "        self.drug_manufacturer = drug_manufacturer\n", "\n", "    def display_info(self):\n", "        \"\"\"Displays basic information about the medical company.\"\"\"\n", "        super().display_info()\n", "        print(f\"Medical Specialty: {self.specialty}\")\n", "        print(f\"Drug Manufacturer: {'Yes' if self.drug_manufacturer else 'No'}\")\n", "```\n", "\n", "We get the following behaviour:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company Name: HealthCare Inc.\n", "Ticker Symbol is: HCI\n", "Medical Specialty: Oncology\n", "Drug Manufacturer: Yes\n" ] } ], "source": [ "import company as cp\n", "med_comp = cp.MedicalCompany(name=\"HealthCare Inc.\", specialty=\"Oncology\", drug_manufacturer=True, ticker=\"HCI\")\n", "med_comp.display_info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See further examples in [medical.py](https://github.com/borisbolliet/company_package/blob/main/company/medical/medical.py)." 
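, "\n",
"\n",
"To experiment with inheritance without installing the package, here is a minimal self-contained sketch. The two classes are simplified stand-ins (assumptions, not the code from the repository), and `describe` is a hypothetical helper that returns lines instead of printing them:\n",
"\n",
"```python\n",
"class Company:\n",
"    # Simplified stand-in for the base class (not the package code)\n",
"    def __init__(self, name, ticker=None):\n",
"        self.name = name\n",
"        self.ticker = ticker\n",
"\n",
"    def describe(self):\n",
"        # Return a list of lines so the result is easy to inspect\n",
"        return [f'Company Name: {self.name}', f'Ticker Symbol is: {self.ticker}']\n",
"\n",
"\n",
"class MedicalCompany(Company):\n",
"    # Child class: extends the constructor and overrides describe()\n",
"    def __init__(self, name, specialty, ticker=None):\n",
"        super().__init__(name, ticker)  # the parent sets name and ticker\n",
"        self.specialty = specialty      # attribute that only the child has\n",
"\n",
"    def describe(self):\n",
"        # Reuse the parent's lines, then append the child-specific one\n",
"        return super().describe() + [f'Medical Specialty: {self.specialty}']\n",
"\n",
"\n",
"med_comp = MedicalCompany(name='HealthCare Inc.', specialty='Oncology', ticker='HCI')\n",
"for line in med_comp.describe():\n",
"    print(line)\n",
"\n",
"# An instance of the child class is also an instance of the parent class\n",
"print(isinstance(med_comp, Company))\n",
"```\n",
"\n",
"The child class reuses the parent's constructor and method through `super()` and only adds what is specific to it."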
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Package data\n", "\n", "\n", "It can be useful (sometimes necessary) to store data in your package. (For example, this can be useful for running minimal examples or when tabulated quantities are needed for specific calculations. In general, it is advised to not include data files in your package distribution and store them somewhere else.)\n", "\n", "We create a `data` folder in the core package folder, and put the data there. \n", "\n", "Let us say we have a dataset with drug approval data `drug_data.csv`.\n", "\n", "We put this file in `company_package/company/data/drug_data.csv`.\n", "\n", "\n", "And we tell our configuration `pyproject.toml` about it:\n", "\n", "```toml\n", "[tool.setuptools.package-data]\n", "\"company\" = [\"data/*\"]\n", "```\n", "\n", "This tells `setuptools` to include the data in the package distribution, and that\n", "the data is in the `company` folder. \n", "\n", "See our [pyproject.toml](https://github.com/borisbolliet/company_package/blob/main/pyproject.toml) file for details.\n", "\n", "Let us create an example and make up a dataset that we then move to the `company_package/company/data/` folder. 
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import os\n", "\n", "# Create the sample drug data as specified\n", "data = {\n", " \"company_name\": [\"PharmaCorp\", \"PharmaCorp\", \"HealthMed\", \"HealthMed\", \"PharmaCorp\", \"PharmaCorp\", \"BioLife\", \"BioLife\"],\n", " \"drug_name\": [\"DrugA\", \"DrugB\", \"DrugC\", \"DrugD\", \"DrugE\", \"DrugF\", \"DrugG\", \"DrugH\"],\n", " \"approval_attempts\": [3, 2, 1, 4, 1, 5, 3, 2],\n", " \"approval_status\": [\"approved\", \"approved\", \"approved\", \"approved\", \"approved\", \"rejected\", \"rejected\", \"approved\"]\n", "}\n", "\n", "# Convert to a DataFrame\n", "drug_data_df = pd.DataFrame(data)\n", "\n", "# Save the DataFrame to a CSV file\n", "file_path = f\"{os.path.expanduser('~')}/Desktop/drug_data.csv\"\n", "drug_data_df.to_csv(file_path, index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And move the file to the `data` folder in the package.\n", "\n", "Our tree structure now looks like this:\n", "\n", "```bash\n", "company_package/\n", "├── README.md\n", "├── company\n", "│   ├── __init__.py\n", "│   ├── base_company.py\n", "│ ├── version.py\n", "│   ├── medical\n", "│   │   ├── __init__.py\n", "│   │   └── medical.py\n", "│ └── data\n", "│ └── drug_data.csv\n", "...\n", "```\n", "\n", "To see how this is used, look at the `drug_approval_summary` method in the `medical.py` file.\n", "\n", "Note that we use the `files` function from the `importlib.resources` package to get the path to the data file automatically. At the top of the `medical.py` file, we add:\n", "\n", "```python\n", "from importlib.resources import files\n", "```\n", "\n", "\n", "Let us try. 
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Drug Approval Summary for PharmaCorp:\n", " - DrugA: 2 failed attempt(s) before approval\n", " - DrugB: 1 failed attempt(s) before approval\n", " - DrugE: 0 failed attempt(s) before approval\n", " - DrugF: 4 failed attempt(s) before approval\n" ] } ], "source": [ "import company as cp\n", "med_comp = cp.MedicalCompany(name=\"PharmaCorp\", specialty=\"Oncology\", drug_manufacturer=True)\n", "med_comp.drug_approval_summary()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Turning methods into commands\n", "\n", "Once you have reached a certain level of maturity in your package, you might want to turn \n", "some methods into commands that can be run from the command line.\n", "\n", "To do so, create a `cli.py` file in the core package folder.\n", ", i.e., the tree structure is now:\n", "\n", "```bash\n", "company_package/\n", "├── README.md\n", "├── company\n", "│ ├── __init__.py\n", "│ ├── base_company.py\n", "│ ├── version.py\n", "│ ├── medical\n", "│ │ ├── __init__.py\n", "│ │ └── medical.py\n", "│ └── cli.py\n", "...\n", "```\n", "\n", "Let us implement two commands. 
One that simply uses the `display_info` method of the `Company` class (call it `display_info`),\n", "and one that actually performs some calculations (call it `get_stock_price_difference`).\n", "\n", "\n", "See the [cli.py](https://github.com/borisbolliet/company_package/blob/main/company/cli.py) for their implementation.\n", "\n", "\n", "Then tell the package that it needs to create \n", "console commands for your package from the methods in the `cli.py` file.\n", "\n", "\n", "Do it in the `pyproject.toml` file, and add:\n", "\n", "```toml\n", "[project.scripts]\n", "company = \"company.cli:main\"\n", "```\n", "\n", "Now, in bash, you can run, for instance:\n", "\n", "```bash\n", "company display_info --ticker=AAPL\n", "```\n", "\n", "Let's do it in the notebook." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company package version: 0.0.post2\n", "Company Name: N/A\n", "Ticker Symbol is: AAPL\n" ] } ], "source": [ "!company display_info --ticker AAPL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us now use the second command." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company package version: 0.0.post2\n", "Stock price difference for NVDA over 5mo ending 2025-09-23: 47.15650939941406\n" ] } ], "source": [ "!company get_stock_price_difference --ticker NVDA --interval 5mo --stop_date 2025-09-23" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can ask for help on the commands by running:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company package version: 0.0.post2\n", "usage: company [-h] {display_info,get_stock_price_difference} ...\n", "\n", "Company CLI Tool\n", "\n", "positional arguments:\n", " {display_info,get_stock_price_difference}\n", " display_info Display company information\n", " get_stock_price_difference\n", " Get stock price difference\n", "\n", "options:\n", " -h, --help show this help message and exit\n" ] } ], "source": [ "!company --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And further details on a specific command by running:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Company package version: 0.0.post2\n", "usage: company get_stock_price_difference [-h] --ticker TICKER\n", " [--interval INTERVAL] --stop_date\n", " STOP_DATE\n", "\n", "options:\n", " -h, --help show this help message and exit\n", " --ticker TICKER Stock ticker symbol (e.g., AAPL).\n", " --interval INTERVAL Time period (e.g., '1y', '6mo', '2y').\n", " --stop_date STOP_DATE\n", " End date in YYYY-MM-DD format.\n" ] } ], "source": [ "!company get_stock_price_difference --help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "**Exercise:** Create your own command that does something interesting.\n", "\n", "Where are the commands stored?\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Naming conventions\n", "\n", "The style guide for Python code is called [PEP 8](https://www.python.org/dev/peps/pep-0008/).\n", "\n", "PEP stands for Python Enhancement Proposal. There are many PEPs; each has a number, and they are all listed on this [page](https://peps.python.org/).\n", "\n", "Here’s a guide to naming conventions in Python, following the PEP 8 style guide and best practices:\n", "\n", "\n", "### Package and File Names\n", "\n", "- **Convention:** Use lowercase letters. You can use underscores (`_`) when necessary.\n", "\n", "- **Reason:** Keeps names **concise** and **readable**, and avoids **naming conflicts**.\n", "\n", "**Examples**:\n", "\n", "- Package: `my_package`, `data_tools`\n", "\n", "- File: `process_data.py`, `utils.py`\n", "\n", "\n", "### Modules\n", "\n", "- **Convention:** Same as file names (lowercase, underscores if needed).\n", "\n", "- **Reason:** A module's name is the name of its file, without the `.py` extension.\n", "\n", "**Examples**:\n", "\n", "- `data_analysis`, `file_handler`\n", "\n", "\n", "### Classes\n", "\n", "- **Convention:** Use `PascalCase` (aka **CapitalizedWords**).\n", "\n", "- **Reason:** Makes it easy to distinguish classes from variables or functions/methods.\n", "\n", "**Examples**:\n", "\n", "- `DataProcessor`, `MyCustomException`\n", "\n", "\n", "\n", "### Methods\n", "\n", "- **Convention:** Use `snake_case` (all lowercase with underscores between words).\n", "\n", "- **Reason:** Matches the function naming convention in Python.\n", "\n", "**Examples**:\n", "\n", "- `process_data`, `get_user_input`\n", "\n", "\n", "### Variables\n", "\n", "- **Convention:** Use `snake_case`.\n", "\n", "- **Reason:** Stays consistent with function and method names and is easy to read.\n", "\n", "**Examples**:\n", "\n", "- `user_name`, `max_value`\n", "\n", "\n", "### Additional Notes\n", "\n", "- Constants should use `UPPERCASE_WITH_UNDERSCORES`.\n", "\n", "  - Example: `DEFAULT_TIMEOUT`, `MAX_RETRIES`\n", "\n", "- Private 
or \"internal use only\" variables/methods should begin with a single underscore.\n", "\n", "  - Example: `_private_method`, `_internal_cache`\n", "\n", "\n", "Here is an example of a **private method** and a private attribute:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n", "15\n" ] } ], "source": [ "class Calculator:\n", "    def __init__(self):\n", "        self._factor = 2  # Private attribute for internal use\n", "\n", "    def multiply(self, number):\n", "        \"\"\"Public method to multiply a number by the private factor.\"\"\"\n", "        return self._private_multiply(number, self._factor)\n", "\n", "    def _private_multiply(self, num1, num2):\n", "        \"\"\"Private method to perform multiplication.\"\"\"\n", "        return num1 * num2\n", "\n", "# Usage example\n", "calc = Calculator()\n", "\n", "# Using the public method (preferred)\n", "result = calc.multiply(5)\n", "print(result)  # Output: 10\n", "\n", "# Accessing the private method directly (discouraged but possible)\n", "direct_result = calc._private_multiply(5, 3)\n", "print(direct_result)  # Output: 15\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, `_private_multiply` is a private method because it starts with a single underscore; in principle, it should not be used outside the class.\n", "Similarly, `_factor` is a private attribute because it starts with a single underscore and should not be accessed from outside the class.\n", "This is a convention only: Python does not enforce it.\n", "\n", "\n", "What is the point of this?\n", "\n", "To hide what is under the kitchen sink, i.e., the internal details that your users do not need to know about." 
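, "\n",
"\n",
"The conventions in this section can be seen together in one short sketch. The names used here (`MAX_RETRIES`, `DataProcessor`, `process_data`, `_record_attempt`, `_attempt_count`) are hypothetical and only illustrate the naming styles; they are not part of the `company` package.\n",
"\n",
"```python\n",
"MAX_RETRIES = 3  # constant: UPPERCASE_WITH_UNDERSCORES\n",
"\n",
"\n",
"class DataProcessor:  # class: PascalCase\n",
"    def __init__(self, user_name):\n",
"        self.user_name = user_name  # public attribute: snake_case\n",
"        self._attempt_count = 0  # internal attribute: leading underscore\n",
"\n",
"    def process_data(self):  # public method: snake_case\n",
"        self._record_attempt()\n",
"        return f'{self.user_name}: attempt {self._attempt_count} of {MAX_RETRIES}'\n",
"\n",
"    def _record_attempt(self):  # internal helper: leading underscore\n",
"        self._attempt_count += 1\n",
"\n",
"\n",
"processor = DataProcessor('ada')\n",
"print(processor.process_data())  # prints: ada: attempt 1 of 3\n",
"```\n",
"\n",
"Only `process_data` and `user_name` are meant for users here; the underscore-prefixed names can change between versions without breaking anyone's code."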
] } ], "metadata": { "kernelspec": { "display_name": "cpenv", "language": "python", "name": "cpenv" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 4 }