Note
Apache Airflow job is powered by Apache Airflow.
A Python package lets you organize related Python modules into a single directory hierarchy. A package is typically represented as a directory that contains a special file called `__init__.py`. Inside a package directory, you can have multiple Python module files (.py files) that define functions, classes, and variables. With Apache Airflow jobs, you can develop your own private packages to add custom Apache Airflow operators, hooks, sensors, plugins, and more.
In this tutorial, you'll build a simple custom operator as a Python package, add it as a requirement in your Apache Airflow job, and import your private package as a module in your DAG file.
Develop a custom operator and test it with an Apache Airflow DAG
Create a file called `sample_operator.py` and turn it into a private package. If you need help, check out this guide: Creating a package in Python.

```python
from airflow.models.baseoperator import BaseOperator


class SampleOperator(BaseOperator):
    def __init__(self, name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        message = f"Hello {self.name}"
        return message
```
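One possible project layout for turning the operator into an installable package is sketched below. This is an assumption for illustration: the package name `airflow_operator` matches the import used in the DAG file, while the project root name and metadata are placeholders.

```
airflow_operator/            # project root (placeholder name)
├── pyproject.toml           # minimal packaging metadata
└── airflow_operator/        # the importable package
    ├── __init__.py
    └── sample_operator.py
```

From the project root, a standard build tool such as `python -m build` can then produce a `.whl` file suitable for uploading to your repository.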
Next, create an Apache Airflow DAG file called `sample_dag.py` to test the operator you made in the first step.

```python
from datetime import datetime

from airflow import DAG

# Import from private package
from airflow_operator.sample_operator import SampleOperator

with DAG(
    "test-custom-package",
    tags=["example"],
    description="A simple tutorial DAG",
    schedule_interval=None,
    start_date=datetime(2021, 1, 1),
) as dag:
    task = SampleOperator(task_id="sample-task", name="foo_bar")
    task
```
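If a full Airflow installation isn't available locally, the operator's `execute()` logic can be spot-checked with a minimal stand-in for `BaseOperator`. This is a sketch: the stub below is an assumption for local testing only, not the real Airflow base class.

```python
# Minimal local check of the operator's logic, using a stub in place of
# airflow.models.baseoperator.BaseOperator (assumption: the real base
# class isn't needed just to exercise execute()).
class BaseOperator:
    def __init__(self, task_id=None, **kwargs):
        self.task_id = task_id


class SampleOperator(BaseOperator):
    def __init__(self, name: str, **kwargs) -> None:
        super().__init__(**kwargs)
        self.name = name

    def execute(self, context):
        return f"Hello {self.name}"


op = SampleOperator(task_id="sample-task", name="foo_bar")
print(op.execute(context={}))  # prints: Hello foo_bar
```

This only verifies the return value of `execute()`; scheduling behavior still needs a real Airflow environment.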
Set up a GitHub repository with your `sample_dag.py` file in the `Dags` folder, along with your private package file. You can use formats like `.zip`, `.whl`, or `.tar.gz`. Put the file in either the `Dags` or `Plugins` folder, whichever fits best. Connect your Git repository to your Apache Airflow job, or try the ready-made example at Install-Private-Package.
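The resulting repository might look like the sketch below. The folder names follow the `Dags`/`Plugins` convention above; `private.whl` is a placeholder file name, and `<repoName>` is whatever you named your repository.

```
<repoName>/
├── Dags/
│   └── sample_dag.py
└── Plugins/
    └── private.whl
```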
Add your package as a requirement
Add the package under Airflow requirements using the format `/opt/airflow/git/<repoName>/<pathToPrivatePackage>`.

For example, if your private package sits at `/dags/test/private.whl` in your GitHub repo, add `/opt/airflow/git/<repoName>/dags/test/private.whl` to your Airflow environment.
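The mapping from a repository path to a requirements entry can be sketched as a small helper. This is a hypothetical function for illustration, not part of any Airflow API; the `/opt/airflow/git/` prefix is taken from the format above.

```python
# Hypothetical helper: build the Airflow requirements entry for a private
# package stored in the connected Git repository, assuming repo contents
# are mounted under /opt/airflow/git/<repoName>/ as described above.
def requirement_path(repo_name: str, package_path: str) -> str:
    return f"/opt/airflow/git/{repo_name}/{package_path.lstrip('/')}"


# "<repoName>" is a placeholder for your actual repository name.
print(requirement_path("<repoName>", "/dags/test/private.whl"))
# prints: /opt/airflow/git/<repoName>/dags/test/private.whl
```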