
Find missing date data with Python

Written by Arthur Camberlein

How to use Python to find the dates that are missing from a file.

Note that this could also be achieved with many other programming or statistical languages (Julia, R, ...), or even with Excel or another spreadsheet tool. This is one example among others.

Library to leverage

For this example, I will use the pandas library, so be sure to install it first (for example with pip install pandas). Why pandas? Because it is one of the most common Python data libraries and is easy to use, with built-in functions for working with dates.

The how-to

A few quick steps to find the missing dates in your data with Python.

Create a list

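Create a list of the dates that appear in your data. It could be exported from any source you have (Analytics, Google Search Console, conversions from your CMS). Let's call it date_missing: despite the name, it holds the dates you do have, and the gaps are computed from it in the last step.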
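
As a minimal sketch, assuming your export is a CSV file with a column named date (a hypothetical name, as is the clicks column), the list could be built as follows. The inline sample stands in for a real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical export: a small inline CSV stands in for a real file
csv_export = io.StringIO(
    "date,clicks\n"
    "2023-08-17,12\n"
    "2023-08-18,9\n"
    "2023-08-20,15\n"
)

df = pd.read_csv(csv_export)

# Keep the date column as a plain Python list of strings
date_missing = df["date"].tolist()

print(date_missing)
```

With a real export, you would pass the file path to pd.read_csv instead of the inline sample.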

Convert the list into a datetime series

date_series = pd.to_datetime(pd.Series(date_missing))
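
Real exports sometimes contain values that are not valid dates. The sketch below is one possible way to handle that, assuming the dates use the YYYY-MM-DD format shown in this article: invalid entries are converted to NaT and then dropped. The sample list is made up for illustration.

```python
import pandas as pd

# Made-up sample with one invalid entry
date_missing = ["2023-08-17", "2023-08-18", "not a date", "2023-08-20"]

# errors="coerce" turns values that do not match the format into NaT
# instead of raising an exception
date_series = pd.to_datetime(
    pd.Series(date_missing), format="%Y-%m-%d", errors="coerce"
)

# Drop the NaT rows so only valid dates remain
date_series = date_series.dropna()

print(date_series.dt.strftime("%Y-%m-%d").tolist())
```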

Create a date range from min to max

Create a date range from the minimum to the maximum date in your data.

all_dates = pd.date_range(start=date_series.min(), end=date_series.max())
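
To illustrate what this range contains, here is a small sketch on a three-date sample (made up for this example). By default pd.date_range produces one entry per calendar day and includes both endpoints. The freq="B" variant is an assumption about your needs: it only makes sense if weekends are not expected in your data.

```python
import pandas as pd

date_series = pd.to_datetime(pd.Series(["2023-08-17", "2023-08-18", "2023-08-20"]))

# Default frequency is one calendar day; both endpoints are included
all_dates = pd.date_range(start=date_series.min(), end=date_series.max())
print(len(all_dates))  # 4 days: August 17, 18, 19 and 20

# Assumption: if only weekdays matter, freq="B" keeps business days only
weekdays = pd.date_range(start=date_series.min(), end=date_series.max(), freq="B")
print(len(weekdays))  # 2 days: Thursday August 17 and Friday August 18
```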

Find the dates that are not in your dataset

Find dates that are in all_dates but not in your data

missing_dates = all_dates.difference(date_series)
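
The result is a DatetimeIndex. As a small sketch on a made-up three-date sample, this shows the difference step and one way to turn the result back into plain strings, which may be easier to paste into a report.

```python
import pandas as pd

date_series = pd.to_datetime(pd.Series(["2023-08-17", "2023-08-18", "2023-08-20"]))
all_dates = pd.date_range(start=date_series.min(), end=date_series.max())

# Dates present in the full range but absent from the data
missing_dates = all_dates.difference(date_series)

# Format the result back to plain strings
missing_as_text = missing_dates.strftime("%Y-%m-%d").tolist()
print(missing_as_text)  # ['2023-08-19']
```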

The full script

import pandas as pd

# Assuming 'date_missing' is your list of dates
date_missing = [
"2023-08-17",
"2023-08-18",
"2023-08-20",
"2023-08-21",
"2023-08-22",
"2023-08-23",
"2023-08-28",
"2023-08-29",
"2023-08-30",
"2023-08-31",
"2023-09-02",
"2023-09-05",
"2023-09-07",
"2023-09-08",
"2023-09-09",
"2023-09-10",
"2023-09-13",
"2023-09-15",
"2023-09-16",
"2023-09-17",
"2023-09-21",
"2023-09-22",
"2023-09-23",
"2023-09-27",
"2023-09-28"]  # replace with your list

# Convert the list into a datetime series
date_series = pd.to_datetime(pd.Series(date_missing))

# Create a date range from minimum to maximum date in your data
all_dates = pd.date_range(start=date_series.min(), end=date_series.max())

# Find dates that are in all dates but not in your data
missing_dates = all_dates.difference(date_series)

print(missing_dates)
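
If your dates live in a dataframe next to other columns, a possible variant (not the method used above) is to reindex the dataframe on the full date range, so that missing dates show up as empty rows. This is a sketch under assumptions: the column names date and clicks and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: one row per date that has data
df = pd.DataFrame(
    {"date": ["2023-08-17", "2023-08-18", "2023-08-20"], "clicks": [12, 9, 15]}
)
df["date"] = pd.to_datetime(df["date"])

# Index by date, then reindex on the full daily range:
# dates absent from the data appear as rows filled with NaN
all_dates = pd.date_range(start=df["date"].min(), end=df["date"].max())
filled = df.set_index("date").reindex(all_dates)

# Rows where 'clicks' is NaN correspond to the missing dates
missing_dates = filled[filled["clicks"].isna()].index
print(missing_dates.strftime("%Y-%m-%d").tolist())  # ['2023-08-19']
```

This variant keeps the other columns aligned with the dates, which helps if you later want to fill the gaps with a default value.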

Blog post tagged in: Data, Python, Python SEO, Tips
