Airflow DAG start_date with days_ago is making us confused all the time

Will the DAG be started? When will it be kicked off?

The Airflow official website says:

"The first DAG Run is created based on the minimum start_date for the tasks in your DAG. Subsequent DAG Runs are created by the scheduler process, based on your DAG's schedule_interval, sequentially."

The note makes sense at first glance. However, when you try to create your Airflow DAG with a cron string, it is hard to tell what it actually means.

When my friend Yuxia came to discuss her case, running a DAG every even day at 1 a.m., I thought it would be easy. I can check the correctness of the cron string with Crontab guru.

    dag = DAG('demo-cron-schedule-interval', default_args=default_args, schedule_interval='0 1 2-30/2 * *')

Going by the quote above, the start_date is yesterday and the first DAG Run should be created, but my cron says it shouldn't create a DAG Run yesterday since it was an odd day. In fact, when she checked the system, no DAG Run had started at 01:00:00.

So what's wrong with the quote? Why was Yuxia's DAG not running? Out of curiosity, I checked the code logic in the scheduler_job. In this post, I'll try to explain the outcome.

What scope

- The DAG has a start_date that is not set as datetime.timedelta; it could be e.g. days_ago(1).
- The start_date is set in default args ONLY.
- The DAG uses a cron string or preset as schedule_interval, e.g. 0 1 2-30/2 * *

Issue to explain

Will the first DAG Run be kicked off by the Airflow scheduler?

Concepts from code

DAG start_date resolve. The scheduler parses the DAGs every 5 seconds (depending on setup). Each time the scheduler runs, it calculates a start_date from the current time (utcnow()). days_ago(1) is resolved as follows: start_date = utcnow() - (1 day), and by default the time is set to midnight. It's very important to realise that start_date + (1 day) != utcnow().

DAG start_date adjustment. Airflow starts subprocesses to create DAG Runs. It first checks the schedule_interval and calculates the previous cron time (previous_cron), the cron time before that (previous_pre_cron) and the next cron time (next_cron), all based on the current time:

    previous_pre_cron -> previous_cron -> utcnow() -> next_cron

The start_date of a DAG is then adjusted by the scheduler. Within our scope, we can think of the adjustment as the following rule: it picks the later one of previous_pre_cron and the resolved start_date, and updates dag.start_date.

    dag.start_date = max(previous_pre_cron, dag.start_date)

Normalize the schedule to get next_run_date. This step is named normalize_schedule in the code logic. It is the adjusted start_date that gets normalized, and the resulting next_run_date will be the DAG Run execution_date. It tries to align the start_date to one of the cron times. For example, if the cron times are 08-14 01:00:00 and 08-16 01:00:00, any start_time in between is normalized to 08-16 01:00:00, which is the next cron time from the start date. If a start_time equals a cron time, the result is that same time:

    normalize_schedule(08-14 01:00:00) = 08-14 01:00:00

From the FAQ, we know that the Airflow scheduler triggers the task soon after start_date + schedule_interval has passed, which I suspect causes confusion in the context of a cron schedule_interval. From the code logic, I think it means execution_date + schedule_interval. If your cron means every 2 days, then the schedule_interval is 2 days.

To answer the question, we need to go through 4 steps to get the result, among them:

- Cron time calculation: previous_pre_cron, previous_cron, next_cron.
- Adjust start_date to align with schedule_interval.

Let's assume some facts to continue with a calculation example.

Calculate previous_pre_cron, previous_cron and next_cron based on the time when the scheduler runs. Since the scheduler runs periodically, these times are likely to change during the day.

Calculate the start_date based on the time when the scheduler runs. It changes as well when given a config such as days_ago(1). Take three scheduler run times within the same day: the start_date is the same for all of them, namely midnight of the previous day.
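The steps described in this post can be put together in a small sketch. This is only my reading of the logic, not the scheduler's actual code: the helper names (is_cron_time, next_cron, prev_cron, resolve_days_ago, first_run_decision) are hypothetical, the cron '0 1 2-30/2 * *' is hard-coded by hand instead of using a cron library, the year 2019 is made up for illustration, and "execution_date + schedule_interval" is assumed to mean the next cron time after the execution_date.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)


def is_cron_time(t):
    """True if t matches '0 1 2-30/2 * *', i.e. 01:00 on an even day of the month."""
    return (t.hour, t.minute, t.second, t.microsecond) == (1, 0, 0, 0) and t.day % 2 == 0


def next_cron(t):
    """First cron time strictly after t (hand-rolled, hypothetical helper)."""
    candidate = t.replace(hour=1, minute=0, second=0, microsecond=0)
    if candidate <= t:
        candidate += ONE_DAY
    while candidate.day % 2 != 0:
        candidate += ONE_DAY
    return candidate


def prev_cron(t):
    """Last cron time strictly before t (hand-rolled, hypothetical helper)."""
    candidate = t.replace(hour=1, minute=0, second=0, microsecond=0)
    if candidate >= t:
        candidate -= ONE_DAY
    while candidate.day % 2 != 0:
        candidate -= ONE_DAY
    return candidate


def resolve_days_ago(n, now):
    """Midnight of the day n days before 'now', as described for days_ago(n)."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0) - timedelta(days=n)


def normalize_schedule(t):
    """Align t to a cron time: keep it if it already is one, else take the next one."""
    return t if is_cron_time(t) else next_cron(t)


def first_run_decision(now):
    """Return (execution_date, period_end, triggered) for a scheduler run at 'now'."""
    start_date = resolve_days_ago(1, now)            # resolve start_date
    previous_cron = prev_cron(now)                   # cron time calculation
    previous_pre_cron = prev_cron(previous_cron)
    adjusted_start = max(previous_pre_cron, start_date)  # adjust start_date
    execution_date = normalize_schedule(adjusted_start)  # normalize
    period_end = next_cron(execution_date)           # assumed: execution_date + interval
    return execution_date, period_end, period_end <= now


# Scheduler runs shortly after 01:00 on an even day (year is an assumption).
now = datetime(2019, 8, 16, 1, 5)
execution_date, period_end, triggered = first_run_decision(now)
print(execution_date, period_end, triggered)
# -> 2019-08-16 01:00:00 2019-08-18 01:00:00 False
```

Under these assumptions, the scheduler run at 01:05 on 08-16 resolves the start_date to 08-15 00:00:00, normalizes it to the execution_date 08-16 01:00:00, and does not create a DAG Run because 08-18 01:00:00 has not passed yet, which matches what Yuxia saw.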