
Running Airflow locally is usually not a feasible option beyond the test phase. You can deploy Airflow on EC2 instances with Docker/Airflow images; the following are the options and a guide for deploying it to AWS EC2. Custom images can also be kept in AWS ECR: to create a repository, hop into the ECR console, click on Create repository, and choose whatever name you feel adequate. Since task logs accumulate on disk, we can create a workflow that runs every 7 days and cleans up the log files, so disk usage won't be a problem as long as memory consumption remains constant.
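A minimal sketch of such a cleanup DAG is shown below. It assumes the default log directory of the puckel image (/usr/local/airflow/logs) and a 15-day retention window; the dag_id, dates and retention are placeholder choices rather than part of the original guide.

# Hypothetical cleanup DAG: removes task log files older than MAX_LOG_AGE_DAYS.
# Assumes the default log directory of puckel/docker-airflow; adjust BASE_LOG_DIR as needed.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

BASE_LOG_DIR = "/usr/local/airflow/logs"  # assumption: default in the puckel image
MAX_LOG_AGE_DAYS = 15                     # assumption: keep roughly two weeks of logs

default_args = {
    "owner": "airflow",
    "start_date": datetime(2020, 1, 1),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="cleanup_airflow_logs",
    default_args=default_args,
    schedule_interval=timedelta(days=7),  # run once every 7 days
    catchup=False,
) as dag:
    delete_old_logs = BashOperator(
        task_id="delete_old_logs",
        # Delete old log files, then prune the empty directories they leave behind.
        bash_command=(
            "find {dir} -type f -mtime +{days} -delete && "
            "find {dir} -type d -empty -delete"
        ).format(dir=BASE_LOG_DIR, days=MAX_LOG_AGE_DAYS),
    )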

#AIRFLOW AWS INSTALL#
You can install and configure Airflow on EC2 the same way you do on your local computer, but I prefer setting it up with the Docker image by puckel here.

First, either use an AMI that already has Docker installed or install Docker yourself.

Next, pull the image from Docker Hub: docker pull puckel/docker-airflow

Here you might get a SQLAlchemy version conflict (if not, ignore this step). In that case, change this line in the Dockerfile to use a higher Airflow version, such as 1.10.10, or hardcode the SQLAlchemy version: ARG AIRFLOW_VERSION=1.10.9 # change this to 1.10.10

Next, you may need to add a user in Postgres.

Now you can run it as: docker run -d -p 8080:8080 puckel/docker-airflow webserver

Also, in order to keep your DAGs in sync, mount the dags folder on your EC2 instance onto the container's dags folder like below (a small test DAG to verify this is sketched at the end of this section): docker run -d -p 8080:8080 -v /path/to/dags/on/your/ec2/folder/:/usr/local/airflow/dags puckel/docker-airflow webserver

In order to get a command line inside the container (for starting the executor, scheduler, etc.), grab the container name/id from docker ps and use: docker exec -ti <container_name_or_id> bash

In order to access the UI in a browser from any other computer (your own laptop), first enable HTTP port 8080 in the EC2 security group for your IP; from the browser you will then be able to access it at <ec2-public-ip>:8080.

The monthly cost of running Airflow for a whole month on a t3.medium will be around 32.37 USD and can be calculated here.

Other third-party managed options to run Airflow on AWS: Astronomer is a company that provides fully hosted Airflow on all cloud platforms, with advanced features such as monitoring. They have some of the top Airflow contributors on their team. Astronomer costs around 100 USD/month per 10 AU (1 CPU, 3.75 GB memory).
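Returning to the EC2 setup: to sanity-check that the mounted dags folder is being picked up, you can drop a minimal DAG like the sketch below into /path/to/dags/on/your/ec2/folder/ and confirm it appears in the UI on port 8080. The dag_id, schedule and echoed message are arbitrary placeholders, not part of the original guide.

# Hypothetical smoke-test DAG: place this file in the dags folder mounted into the
# container; it should show up in the Airflow UI after a scheduler heartbeat or two.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

with DAG(
    dag_id="ec2_mount_smoke_test",  # arbitrary placeholder name
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    say_hello = BashOperator(
        task_id="say_hello",
        bash_command="echo 'dags folder is mounted and scheduled correctly'",
    )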
