'Set domainMin to 6 months before max date in data

I have the following Vega-Lite chart:

Open the Chart in the Vega Editor

Currently, I have the scale set as follows:

"scale": {"domainMin": "2021-06-01"}

However, what I really want is for the domainMin to be automatically calculated to be 6 months before the latest date in the notification_date field in the data. I've looked at aggregate and expressions, but it's not exactly clear.

How can I get the maximum value of notification_date and subtract 6 months from it, and use that in "domainMin"?

Edit: To clarify, I don't want to filter the data. I want the user to be able to zoom out or pan to see the data outside the initial 6-month window. I get exactly what I want with "scale": {"domainMin": "2021-06-01"}, but this becomes out-of-date very quickly.



Solution 1:[1]

I have tried giving params and expr to domainMin, but I was unable to use the data fields in expr through datum.

The 2nd approach I tried will work for you, in this you will need to make use of joinaggregate/calculate/filter transforms. You will manually gather the max year and max months and then use it to filter your data.

Below is the modified config or refer the editor url:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "width": "container",
  "height": "container",
  "config": {
    "group": {"fill": "#e5e5e5"},
    "arc": {"fill": "#2b2c39"},
    "area": {"fill": "#2b2c39"},
    "line": {"stroke": "#2b2c39"},
    "path": {"stroke": "#2b2c39"},
    "rect": {"fill": "#2b2c39"},
    "shape": {"stroke": "#2b2c39"},
    "symbol": {"fill": "#2b2c39"},
    "range": {
      "category": [
        "#2283a2",
        "#003e6a",
        "#a1ce5e",
        "#FDBE13",
        "#F2727E",
        "#EA3F3F",
        "#25A9E0",
        "#F97A08",
        "#41BFB8",
        "#518DCA",
        "#9460A8",
        "#6F7D84",
        "#D1DCA5"
      ]
    }
  },
  "title": "South Western Sydney Cumulative and Daily COVID-19 Cases by LGA",
  "data": {
    "url": "https://davidwales.github.io/nsw-covid-19-data/confirmed_cases_table1_location.csv"
  },
  "transform": [
    {
      "filter": {
        "and": [
          {"field": "lhd_2010_name", "equal": "South Western Sydney"},
          {"not": {"field": "lga_name19", "equal": "Penrith"}}
        ]
      }
    },
    {"calculate": "utcyear(datum.notification_date)", "as": "yearNumber"},
    {"calculate": "utcmonth(datum.notification_date)", "as": "monthNumber"},
    {
      "window": [
        {"op": "count", "field": "notification_date", "as": "cumulative_count"}
      ],
      "frame": [null, 0]
    },
    {
      "joinaggregate": [
        {"field": "monthNumber", "op": "max", "as": "max_month_count"},
        {"field": "yearNumber", "op": "max", "as": "max_year"}
      ]
    },
    {"calculate": "abs(datum.max_month_count-6)", "as": "min_month_count"},
    {
      "filter": "datum.min_month_count < datum.monthNumber && datum.max_year === datum.yearNumber"
    }
  ],
  "layer": [
    {
      "selection": {
        "date": {"type": "interval", "bind": "scales", "encodings": ["x"]}
      },
      "mark": {"type": "bar", "tooltip": true},
      "encoding": {
        "x": {
          "timeUnit": "yearmonthdate",
          "field": "notification_date",
          "type": "temporal",
          "title": "Date"
        },
        "color": {
          "field": "lga_name19",
          "type": "nominal",
          "title": "LGA",
          "legend": {"orient": "top", "columns": 4}
        },
        "y": {
          "aggregate": "count",
          "field": "lga_name19",
          "type": "quantitative",
          "title": "Cases",
          "axis": {"title": "Daily Cases by SWS LGA"}
        }
      }
    },
    {
      "mark": "line",
      "encoding": {
        "x": {
          "timeUnit": "yearmonthdate",
          "field": "notification_date",
          "title": "Date",
          "type": "temporal"
        },
        "y": {
          "aggregate": "max",
          "field": "cumulative_count",
          "type": "quantitative",
          "axis": {"title": "Cumulative Cases"}
        }
      }
    }
  ],
  "resolve": {"scale": {"y": "independent"}}
}

Solution 2:[2]

A bit simpler approach to filter approximately the last six months of data might look like this:

  "transform": [
    ...,
    {"joinaggregate": [{"op": "max", "field": "notification_date", "as": "last_date"}]},
    {"filter": "datum.notification_date > datum.last_date - 6 * 30 * 24 * 60 * 60 * 1000"}
  ]

It makes use of the fact that dates are stored as millisecond time-stamps, and has the benefit that it will work across year boundaries.

Solution 3:[3]

This is not quite an answer to the question you asked, but if your data is reasonably up-to-date (that is, the most recent data point is close to the current date), you can do something like this:

"scale": { "domainMin": { "expr": "timeOffset('month', now(), -6)" } }

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 wahab memon
Solution 2
Solution 3 Stephen Morgan