@Kcnarf
Last active September 13, 2019 10:39
timeline - seasonality detection (II)
license: mit

This block experiments with how to detect whether a timeline has a seasonality component, and how to detect the length of the season (if any).

Seasonality means that the time series has a periodic component, repeating the same pattern in each period. For example, a store's sales may have a week-based seasonality: sales increase on Saturday, while there are no sales at all on Sunday.
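As an illustration of such a periodic component, here is a minimal sketch that builds a series as trend + seasonal offset + optional noise. It loosely follows the shape of the `updateTimeSerie()` function further down; the helper name `makeSeasonalSeries` and its parameters are assumptions made for this example, not part of the block's code.

```javascript
// Hypothetical helper: builds a series as trend + seasonal offset + optional noise.
function makeSeasonalSeries(length, seasonLength, amplitude, slope, noise) {
  var intercept = 10; // same baseline as in the block's code
  var series = [];
  for (var i = 0; i < length; i++) {
    var value = slope * (i + 1) + intercept;
    // The first point of each season dips, the last point peaks
    if (i % seasonLength === 0) {
      value -= amplitude;
    } else if (i % seasonLength === seasonLength - 1) {
      value += amplitude;
    }
    // `noise` is an optional function of the index (e.g. a random jitter)
    value += noise ? noise(i) : 0;
    series.push({ x: i + 1, y: value });
  }
  return series;
}

// A flat (slope 0), noise-free series with a 4-period season of amplitude 8
var demo = makeSeasonalSeries(8, 4, 8, 0);
console.log(demo.map(function (d) { return d.y; }));
// → [2, 10, 10, 18, 2, 10, 10, 18]
```

The same 4-value pattern repeats every 4 points, which is what a seasonality of length 4 looks like once trend and noise are removed.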

Graphically speaking, detecting seasonality is (quite) easy: just look for a repeating pattern. Note that it can be difficult if the pattern has a long period and/or the order of magnitude of the seasonality is low (i.e. the lowest and highest values are not far from the season's mean; but in that case there might be no seasonality at all!).

Computationally speaking, one can use the correlogram. This diagram represents the autocorrelation coefficients of the time series, one per lag (go to this block for detailed explanations of what an autocorrelation coefficient is, and how to compute it). With the help of this diagram, one can identify the season's length, if any.
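A minimal sketch of the idea, assuming the autocorrelation coefficient for a given lag is the Pearson correlation between the series and its lagged copy (which is what `updateAutocorrelations()` further down computes). The function names `autocorrelation`, `correlogram` and `guessSeasonLength` are made up for this example, and picking the highest coefficient is only one naive way to read a correlogram.

```javascript
// Pearson correlation between a series and the same series shifted by `lag` points.
function autocorrelation(values, lag) {
  var n = values.length - lag; // number of (value, lagged value) pairs
  var sumY = 0, sumL = 0, sumYY = 0, sumLL = 0, sumYL = 0;
  for (var i = lag; i < values.length; i++) {
    var y = values[i], lagged = values[i - lag];
    sumY += y; sumL += lagged;
    sumYY += y * y; sumLL += lagged * lagged; sumYL += y * lagged;
  }
  var meanY = sumY / n, meanL = sumL / n;
  var covariance = sumYL / n - meanY * meanL;
  var stdY = Math.sqrt(sumYY / n - meanY * meanY);
  var stdL = Math.sqrt(sumLL / n - meanL * meanL);
  return covariance / (stdY * stdL);
}

// One coefficient per lag, i.e. the data behind the correlogram.
function correlogram(values, minLag, maxLag) {
  var coefs = [];
  for (var lag = minLag; lag <= maxLag; lag++) {
    coefs.push({ lag: lag, coef: autocorrelation(values, lag) });
  }
  return coefs;
}

// Naive guess: the shortest lag that reaches the highest coefficient.
function guessSeasonLength(values, minLag, maxLag) {
  return correlogram(values, minLag, maxLag).reduce(function (best, c) {
    return c.coef > best.coef ? c : best;
  }).lag;
}

// Noise-free series repeating the pattern 2, 10, 10, 18 (season length 4)
var values = [];
for (var i = 0; i < 20; i++) values.push([2, 10, 10, 18][i % 4]);
console.log(guessSeasonLength(values, 2, 10)); // → 4
```

Multiples of the season length (here, a lag of 8) score as high as the season length itself, which is why this sketch keeps the shortest lag on ties. On noisy data a multiple may well come out slightly ahead, so the result should be read as a hint rather than a verdict.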

Usage:

  • in the top graph (the timeline), drag and drop points to update the timeline and create seasons of your choice (some shortcuts are available below the graph);
  • decrease the order of magnitude of the seasonality component to see that, when it is small, it becomes difficult to detect a season: the correlation coefficients for all lags are consistently high (see the next item for details);
  • similarly, increase the trend of the timeline to see that the stronger the trend, the more difficult it is to detect a season: every correlation coefficient is high because the lagged time series and the original time series behave the same way (they share the same trend), and the seasonality component becomes less significant;
  • go to this block to understand that detrending the time series before computing the correlogram is a must-have, because it cancels the effect described in the previous item and makes it possible to detect seasonality with a very small order of magnitude.
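The effect of the trend, and of detrending, can be sketched as follows. This is an illustration under assumptions, not the code of the detrending block: the trend is assumed to be linear and is removed by subtracting a least-squares line, and the helper name `removeLinearTrend` is made up for this example.

```javascript
// Pearson correlation between a series and its copy shifted by `lag` points.
function autocorrelation(values, lag) {
  var n = values.length - lag, sY = 0, sL = 0, sYY = 0, sLL = 0, sYL = 0;
  for (var i = lag; i < values.length; i++) {
    var y = values[i], l = values[i - lag];
    sY += y; sL += l; sYY += y * y; sLL += l * l; sYL += y * l;
  }
  var cov = sYL / n - (sY / n) * (sL / n);
  var varY = sYY / n - Math.pow(sY / n, 2);
  var varL = sLL / n - Math.pow(sL / n, 2);
  return cov / Math.sqrt(varY * varL);
}

// Hypothetical helper: subtracts the least-squares line fitted on (index, value).
function removeLinearTrend(values) {
  var n = values.length, sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
  values.forEach(function (y, x) {
    sumX += x; sumY += y; sumXY += x * y; sumXX += x * x;
  });
  var slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
  var intercept = (sumY - slope * sumX) / n;
  return values.map(function (y, x) { return y - (slope * x + intercept); });
}

// Steep trend (5 per step) with a small 4-period season (amplitude 2), no noise
var raw = [];
for (var i = 0; i < 20; i++) raw.push(5 * (i + 1) + 10 + [-2, 0, 0, 2][i % 4]);
var flat = removeLinearTrend(raw);

// Prints, for each lag: the raw coefficient, then the detrended coefficient
[2, 3, 4, 5].forEach(function (lag) {
  console.log(lag, autocorrelation(raw, lag).toFixed(2), autocorrelation(flat, lag).toFixed(2));
});
```

With the trend in place, the coefficients for all lags sit close to 1, so the bar for the season length barely stands out. After detrending, the coefficient for a lag of 4 stays close to 1 while the coefficients for lags 2 and 3 drop sharply.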

Notes:

  • another block experiments with detrending before computing the correlogram
  • another block experiments with autocorrelation
  • another block experiments with time series correlation
  • another block deals with the impact of seasonality when computing the trend of a timeline

Acknowledgments:

<!DOCTYPE html>
<meta charset="utf-8">
<style>
body {
position: relative;
background-color: #ddd;
margin: auto;
}
#under-construction {
display: none;
position: absolute;
top: 200px;
left: 300px;
font-size: 40px;
}
.controls {
position: absolute;
font: 11px arial;
}
#controls1 {
top: 300px;
left: 10px;
}
#controls2 {
top: 300px;
left: 450px;
}
#controls3 {
top: 300px;
right: 10px;
text-align: right;
}
.viz {
position: absolute;
background-color: white;
border-radius: 10px;
left: 5px;
}
.viz#timelines {
top: 5px;
}
.viz#correlation {
top: 355px;
}
.flow {
position: absolute;
font-size: 30px;
color: darkgrey;
top: 320px;
right: 435px;
}
.axis path,
.axis line {
fill: none;
stroke: black;
shape-rendering: crispEdges;
}
.axis text {
font-family: sans-serif;
font-size: 11px;
}
.grid>line, .grid>.intersect {
fill: none;
stroke: #ddd;
shape-rendering: crispEdges;
vector-effect: non-scaling-stroke;
}
.legend {
font-size: 12px;
}
.dot {
fill: steelblue;
stroke: white;
stroke-width: 3px;
}
.dot.draggable:hover, .dot.dragging {
fill: pink;
cursor: ns-resize;
}
.timeline {
fill: none;
stroke: lightsteelblue;
stroke-width: 2px;
}
.timeline.draggable:hover, .timeline.dragging {
stroke: pink;
opacity: 1;
cursor: ns-resize;
}
.correlation-bar {
fill: grey;
}
</style>
<body>
<div id="timelines" class="viz">
<div id="controls1" class="controls">
update time series with a seasonality length of <a href="#" onclick="updateSeasonalityPeriod(2);">2</a> / <a href="#" onclick="updateSeasonalityPeriod(3);">3</a> / <a href="#" onclick="updateSeasonalityPeriod(4);">4</a> / <a href="#" onclick="updateSeasonalityPeriod(5);">5</a> / <a href="#" onclick="updateSeasonalityPeriod(6);">6</a> / <a href="#" onclick="updateSeasonalityPeriod(7);">7</a> / <a href="#" onclick="updateSeasonalityPeriod(8);">8</a> / <a href="#" onclick="updateSeasonalityPeriod(9);">9</a> / <a href="#" onclick="updateSeasonalityPeriod(10);">10</a> periods
</div>
<div id="controls2" class="controls">
<a href="#" onclick="increaseSeasonOrderOfMagnitude();">increase</a> / <a href="#" onclick="decreaseSeasonOrderOfMagnitude();">decrease</a> seasonality's order of magnitude
</div>
<div id="controls3" class="controls">
<a href="#" onclick="increaseTrend();">increase</a> / <a href="#" onclick="decreaseTrend();">decrease</a> timeline's trend
</div>
</div>
<div id="correlation" class="viz"></div>
<div id="flow" class="flow">&#8615;</div>
<div id="under-construction">
UNDER CONSTRUCTION
</div>
<script src="https://d3js.org/d3.v3.min.js"></script>
<script>
var timeSerie = [];
var randomness = [];
var currentSeasonLength = 4;
var currentSeasonOrderOfMagnitude = 8;
var currentTrend = 1.4;
var shouldDetrend = false;
var WITH_TRANSITION = true;
var WITHOUT_TRANSITION = false;
var duration = 500;
var NEW_RANDOMNESS = true;
var PRESERVE_RANDOMNESS = false;
var timelineVizDimension = {width: 960, height:340},
correlationVizDimension = {width: 960, height:160},
vizMargin = 5,
flowWidth = 20,
legendHeight = 20,
xAxisLabelHeight = 10,
yAxisLabelWidth = 10,
correlationLeftShift = 110,
margin = {top: 20, right: 20, bottom: 20, left: 20},
timelineSvgWidth = timelineVizDimension.width - 2*vizMargin,
timelineSvgHeight = timelineVizDimension.height - 2*vizMargin - flowWidth/2,
correlationSvgWidth = correlationVizDimension.width - 2*vizMargin,
correlationSvgHeight = correlationVizDimension.height - 2*vizMargin - flowWidth/2,
timelineWidth = timelineSvgWidth - margin.left - margin.right - yAxisLabelWidth,
timelineHeight = timelineSvgHeight - margin.top - margin.bottom - xAxisLabelHeight - 1.5*legendHeight,
correlationWidth = correlationSvgWidth - margin.left - margin.right - yAxisLabelWidth - correlationLeftShift,
correlationHeight = correlationSvgHeight - margin.top - margin.bottom;
var drag = d3.behavior.drag()
.origin(function(d) { return d; })
.on("dragstart", dragStarted)
.on("drag", dragged1)
.on("dragend", dragEnded);
var x = d3.scale.linear()
.domain([0, 20])
.range([0, timelineWidth]);
var y = d3.scale.linear()
.domain([0, 50])
.range([0, -timelineHeight]);
var xCorrelation = d3.scale.linear()
.domain([1, 11])
.range([0, correlationWidth]);
var yCorrelation = d3.scale.linear()
.domain([-1, 1])
.range([0, -correlationHeight]);
var xAxisDef = d3.svg.axis()
.scale(x)
.ticks(20);
var yAxisDef = d3.svg.axis()
.scale(y)
.orient("left");
var xAxisCorrelationDef = d3.svg.axis()
.scale(xCorrelation)
.tickValues([2,3,4,5,6,7,8,9,10])
.tickFormat("");
var yAxisCorrelationDef = d3.svg.axis()
.scale(yCorrelation)
.ticks(5)
.orient("left");
var svg = d3.select("#timelines").append("svg")
.attr("width", timelineSvgWidth)
.attr("height", timelineSvgHeight)
.append("g")
.attr("transform", "translate(" + [margin.left, margin.top] + ")");
var container = svg.append("g")
.attr("id", "graph")
.attr("transform", "translate(" + [yAxisLabelWidth, timelineHeight] + ")");
var grid = container.append("g")
.attr("class", "grid");
var intersects = [];
d3.range(1, x.invert(timelineWidth)+1, 1).forEach(function(a) { d3.range(5, y.invert(-timelineHeight)+5,5).forEach(function(b) { intersects.push([a,b])})});
grid.selectAll(".intersect")
.data(intersects)
.enter().append("path")
.classed("intersect", true)
.attr("d", function(d) { return "M"+[x(d[0])-1,y(d[1])]+"h3M"+[x(d[0]),y(d[1])-1]+"v3"});
container.append("text")
.attr("transform", "translate(" + [timelineWidth/2, -timelineHeight] + ")")
.attr("text-anchor", "middle")
.text("Timeline");
container.append("g")
.attr("class", "axis x")
.call(xAxisDef)
.append("text")
.attr("x", timelineWidth)
.attr("y", -6)
.style("text-anchor", "end")
.text("Time");
container.append("g")
.attr("class", "axis y")
.call(yAxisDef)
.append("text")
.attr("transform", "rotate(-90)")
.attr("x", timelineHeight)
.attr("y", 16)
.style("text-anchor", "end")
.text("Amount");
var timeline = container.append("path")
.datum(1)
.classed("timeline serie1", true); // the path's "d" attribute is set later, in updateTimelines()
var dotContainer = container.append("g")
.classed("dots", true);
svg = d3.select("#correlation").append("svg")
.attr("width", correlationSvgWidth)
.attr("height", correlationSvgHeight)
.append("g")
.attr("transform", "translate(" + [margin.left, margin.top] + ")");
container = svg.append("g")
.attr("id", "graph correlation")
.attr("transform", "translate(" + [yAxisLabelWidth + correlationLeftShift, correlationHeight] + ")");
var correlationTitle = container.append("text")
.attr("transform", "translate(" + [correlationWidth/2, -correlationHeight] + ")")
.attr("text-anchor", "middle")
.text("Correlogram");
grid = container.append("g")
.attr("class", "grid");
intersects = [];
d3.range(2, xCorrelation.invert(correlationWidth), 1).forEach(function(a) { d3.range(-1, yCorrelation.invert(-correlationHeight)+0.5,0.5).forEach(function(b) { intersects.push([a,b])})});
grid.selectAll(".intersect")
.data(intersects)
.enter().append("path")
.classed("intersect", true)
.attr("d", function(d) { return "M"+[xCorrelation(d[0])-1,yCorrelation(d[1])]+"h3M"+[xCorrelation(d[0]),yCorrelation(d[1])-1]+"v3"});
container.append("g")
.attr("class", "axis y")
.call(yAxisCorrelationDef);
var xAxisContainer = container.append("g")
.attr("class", "axis x");
xAxisContainer.append("line")
.attr("x1", 0)
.attr("y1", yCorrelation(0))
.attr("x2", correlationWidth)
.attr("y2", yCorrelation(0))
xAxisContainer.append("text")
.attr("transform", "translate("+[xCorrelation(2)-40,yCorrelation(-1)+15]+")")
.attr("text-anchor", "end")
.text("Coefficient of autocorrelation for ... ")
var xTicks = container.select(".axis.x").selectAll(".tick-label")
.data([2,3,4,5,6,7,8,9,10])
.enter()
.append("g")
.classed("tick-label", true)
.attr("transform", function(d) { return "translate("+[xCorrelation(d),0]+")"});
xTicks.append("line")
.attr("x1", 0)
.attr("y1", yCorrelation(0)-3)
.attr("x2", 0)
.attr("y2", yCorrelation(0)+4)
xTicks.append("text")
.attr("transform", "translate("+[0,yCorrelation(-1)+15]+")")
.attr("text-anchor", "middle")
.text(function(d) { return d+"-periods lag"});
var barContainer = container.append("g")
.attr("id", "bar-conatiner");
d3.csv("timeserie.csv", dottype, function(error, dots) {
updateTimeSerie(PRESERVE_RANDOMNESS);
updateDots(WITHOUT_TRANSITION);
updateTimelines(WITHOUT_TRANSITION);
updateAutocorrelations(WITHOUT_TRANSITION);
});
function dottype(d) {
d.x = +d.x;
d.y = +d.y+(+d.random);
timeSerie.push(d);
randomness.push(+d.random);
return d;
}
var line = d3.svg.line()
.x(function(d) { return x(d.x); })
.y(function(d) { return y(d.y); });
function updateSeasonalityPeriod(newSeasonLength) {
currentSeasonLength = newSeasonLength;
updateTimeSerie(NEW_RANDOMNESS);
}
function increaseTrend() {
currentTrend *= 1.6;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function decreaseTrend() {
currentTrend *= 0.625;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function increaseSeasonOrderOfMagnitude() {
currentSeasonOrderOfMagnitude *= 1.6;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function decreaseSeasonOrderOfMagnitude() {
currentSeasonOrderOfMagnitude *= 0.625;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function trend() {
shouldDetrend = false;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function detrend() {
shouldDetrend = true;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function handleDetrend(cb) {
shouldDetrend = cb.checked;
updateTimeSerie(PRESERVE_RANDOMNESS);
}
function updateTimeSerie(withRandom) {
var trend = shouldDetrend ? 0 : currentTrend;
var intercept = 10;
var expected;
timeSerie.forEach(function(d,i) {
expected = trend*d.x+intercept;
switch (i%currentSeasonLength) {
case 0: expected -= currentSeasonOrderOfMagnitude; break;
case (currentSeasonLength-1): expected += currentSeasonOrderOfMagnitude; break;
}
if (withRandom) {
randomness[i] = 3*(Math.random()-0.5);
}
d.y = expected + randomness[i];
})
updateDots(WITH_TRANSITION);
updateTimelines(WITH_TRANSITION);
updateAutocorrelations(WITH_TRANSITION);
}
function updateDots(withTransition) {
var dots = dotContainer.selectAll(".dot.serie1")
.data(timeSerie);
dots.enter()
.append("circle")
.classed("dot draggable serie1", true)
.attr("r", 5)
.attr("cx", function(d) { return x(d.x); })
.call(drag);
dots.transition()
.duration(withTransition? duration : 0)
.attr("cy", function(d) { return y(d.y); })
}
function updateTimelines(withTransition) {
timeline.transition()
.duration(withTransition? duration : 0)
.attr("d", line(timeSerie));
}
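function updateAutocorrelations(withTransition) {
// For each lag from 2 to 10, accumulate the sums needed to compute the Pearson
// correlation between the time series and its lagged copy
var dataForAutocorrelationCoefficients = [];
var autocorCount = 9;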
var lag = 2;
while (lag<=autocorCount+1) {
dataForAutocorrelationCoefficients.push({
lag: lag,
ySum: 0,
squareYSum: 0,
laggedYSum: 0,
squareLaggedYSum: 0,
yLaggedYSum: 0
})
lag++;
}
timeSerie.forEach(function(tsData, tsIndex){
dataForAutocorrelationCoefficients.forEach(function(autocorData) {
if (tsIndex>=autocorData.lag) {
var laggedY = timeSerie[tsIndex-autocorData.lag].y
autocorData.ySum += tsData.y;
autocorData.squareYSum += Math.pow(tsData.y, 2);
autocorData.laggedYSum += laggedY;
autocorData.squareLaggedYSum += Math.pow(laggedY, 2);
autocorData.yLaggedYSum += (tsData.y)*(laggedY);
}
})
})
var autocorrelationCoefficients = [];
dataForAutocorrelationCoefficients.forEach(function(autocorData) {
var autocorSerieLength = timeSerie.length-autocorData.lag;
var yMean = autocorData.ySum/autocorSerieLength;
var laggedYMean = autocorData.laggedYSum/autocorSerieLength;
var yVariance = autocorData.squareYSum/autocorSerieLength - Math.pow(yMean, 2);
var laggedYVariance = autocorData.squareLaggedYSum/autocorSerieLength - Math.pow(laggedYMean, 2);
var yStdDev = Math.pow(yVariance, 0.5)
var laggedYStdDev = Math.pow(laggedYVariance, 0.5)
var yLaggedYCovariance = autocorData.yLaggedYSum/autocorSerieLength - yMean*laggedYMean;
var correlatedTrend = yLaggedYCovariance/(yVariance);
var correlatedIntercept = laggedYMean - correlatedTrend*yMean;
var autocorCoef = yLaggedYCovariance/(yStdDev*laggedYStdDev);
autocorrelationCoefficients.push({
autocorIndex: autocorData.lag,
autocorCoef: autocorCoef
});
});
var bars = barContainer.selectAll(".correlation-bar")
.data(autocorrelationCoefficients);
bars.enter().append("path")
.classed("correlation-bar", true)
.attr("d", function(d) { return "M"+[xCorrelation(d.autocorIndex)-10,yCorrelation(0)]+"h20V"+yCorrelation(d.autocorCoef)+"h-20z" });
bars.transition()
.duration(withTransition? duration : 0)
.attr("d", function(d) { return "M"+[xCorrelation(d.autocorIndex)-10,yCorrelation(0)]+"h20V"+yCorrelation(d.autocorCoef)+"h-20z" });
}
function dragStarted(d) {
d3.select(this).classed("dragging", true);
}
function dragged1(d) {
d.y += y.invert(d3.event.dy);
updateDots(WITHOUT_TRANSITION);
updateTimelines(WITHOUT_TRANSITION);
updateAutocorrelations(WITHOUT_TRANSITION);
}
function dragEnded(d) {
d3.select(this).classed("dragging", false);
}
</script>
x,y,random
1,3,-1
2,10,-1
3,10,1
4,17,0
5,3,1
6,10,2
7,10,0
8,17,-1
9,3,0
10,10,0
11,10,-1
12,17,1
13,3,0
14,10,-1
15,10,1
16,17,0
17,3,2
18,10,1
19,10,-1
20,17,1