Two Empirical Methods for Forecasting Spot Prices and
Constructing Price Forward Curves in the Swiss Power
Market
A master thesis submitted to
EIDGENOSSISCHE TECHNISCHE HOCHSCHULE ZURICH
Master of Science in Management, Technology and Economics
GREGOIRE CARO
Jointly supervised by:
Centre for Energy Economics and Policy (CEPE)
Department of Management, Technology and Economics
Swiss Federal Institute of Technology
Dr. Carlos Ordas Criado
Prof. Thomas Rutherford
swissQuant Group
MSc. Marcus Hildmann
Dr. Florian Herzog
Zurich, May 2010
Contents
1 Introduction
2 Electricity Market Settings
2.1 Physical and Economic Features of Electricity
2.2 Financial Approach
2.2.1 The Spot Price
2.2.2 Forwards and Futures
2.2.3 The Hourly Price Forward Curve
2.3 The Swiss Electricity Market
2.3.1 Power Production
2.3.2 A Key-Location
2.3.3 Financial Data
3 Literature Review
3.1 Load Models
3.2 Spot and Forward Price Models
4 Estimation Methodology
4.1 Presentation of the Models
4.1.1 A Short-Term Spot Price Model
4.1.2 A Hourly Price Forward Curve Model for electricity (HPFC model)
4.2 Regression Techniques
4.2.1 The Lad-lasso Regression
4.2.2 The Least-squares Support Vector Machine Regression
5 Empirical Analysis
5.1 Spot Price Model
5.2 Hourly Price Forward Curve Model
6 Conclusion
Appendices
A Two Key Ideas of Least-Squares Support Vector Machine
A.1 The Maximum Margin
A.2 The Kernel Trick
A.3 General Framework
Bibliography
List of Tables
2.1 The stakeholders of the power market
5.1 Backtest result for one day ahead forecast
5.2 Backtest results for four days ahead forecasts
5.3 Backtest results for seven days ahead forecast
5.4 Results of the falsifiability tests
5.5 Regression on the spot prices
5.6 Selected variables of the PFC model
5.7 Backtesting results for the HPFC model
List of Figures
2.1 A mixed approach
2.2 Structure of the electricity market
2.3 The economic equilibrium of the power market
2.4 Swiss Spot Price in 2008
2.5 Two typical spot weeks in summer and winter
2.6 Implied German Futures curve
2.7 Swiss power production in 2008
2.8 Swiss average reservoirs level
2.9 The Swiss power trade balance
4.1 The short term spot price model
4.2 The statistical approach for the HPFC model
5.1 Grid search results for the short term spot price model
5.2 In-sample fits of the short term spot price model
5.3 A one week ahead forecast
5.4 Forecasted and observed weather
5.5 Training set under a seasonal transition: the first heating days
5.6 Training set under a vacation period: Christmas holidays
5.7 Evolution of the coefficients
5.8 HPFC: determination of the hyper-parameters for ls-svm
5.9 In-sample fit of the HPFC model
5.10 Extreme in-sample fits
5.11 Simulation output
5.12 Hourly Price Forward Curves from the lad-lasso and ls-svm
A.1 The determination process of the classifier hyperplane.
A.2 The Kernel Trick.
A.3 The primal and the dual
Chapter 1
Introduction
The best way to predict the future is to create it.
- Peter F. Drucker (1909-2005).
Liberalization of the European electricity market was initiated in the
1990s. Its two main goals were to achieve competitive prices and to favor
the integration of the power markets across borders. Since then, the elec-
tricity companies have faced a completely new challenge. Electricity has
become a commodity traded on a European market, creating exporting facil-
ities and energy-linked financial indexes. The local supplier-customer model
has shifted toward banking business where complex financial instruments are
used to achieve an efficient inter-temporal market equilibrium.
In Switzerland, the first steps toward liberalization occurred in 1998 with
the creation of a national forward index (the SWEP), but it officially started
in January 2008 with the application of the Federal Electricity Supply Act¹.
Because of this recent start, the market has not yet reached maturity. For
instance, while a Swiss spot price (the Swissix) was recently created
(2006), no financial index for futures (standardized long-term contracts)
exists. The Swiss power market is therefore characterized both by the lack of
some important financial series and by structural specificities such as the
relative importance of hydropower plants (56.1% in 2008). Moreover, while the
literature on foreign power markets is abundant, forecasts concerning the
Swiss market are less developed. Several standard models for electricity price
forecasts have not been tested with Swiss data despite the strategic position
of the country as a major trading partner located in the center of Europe.
The aim of this thesis is to provide new insights into the Swiss electricity
prices from a financial perspective. We explore two models, one for predicting
electricity spot prices, the other for estimating the current prices of
long-term contracts for electricity, the so-called Price Forward Curve² or
PFC. As outlined by Fleten and Lemming [2003], PFCs are important infor-
mation carriers for operational and investment decisions. Practitioners often
need to estimate forward prices, i.e. the price of non-standardized futures, for
more maturity dates than are observed in the market. For that purpose, high
resolution Price Forward Curves are of great interest. We provide such an
estimate for Switzerland and construct hourly forward prices, i.e. an Hourly
Price Forward Curve (HPFC henceforth).
The spot price model estimated here is a standard autoregressive equation
which includes exogenous bottom-up components such as seasonal trends and
¹Act published in the Swiss Federal Gazette on 3 April 2007.
²Some authors prefer the term Forward Price Curve.
temperature variables. We explore its ability to predict the Swiss spot prices
for electricity over different time horizons. Two different regression techniques
are tested: the least-absolute-deviation least-absolute-shrinkage-and-selection-operator
(lad-lasso) and the least-squares support vector machine
(ls-svm) regression approach. These estimators have major advantages com-
pared to the standard linear regression model. Though the lad-lasso remains
a linear model, the estimation procedure is robust to outliers. Moreover,
non-significant coefficients can be ‘shrunk’ toward zero through a procedure that
optimally balances the bias-variance trade-off. Therefore, the lad-lasso re-
gression includes a variable selection mechanism that leaves the researcher
with only the most relevant predictors to interpret. These properties are
particularly interesting when the explained variable displays ‘peaks’, as is
often the case for daily electricity prices, and when a large number of variables
are involved in the estimation process. From another perspective, the
ls-svm approach proposes a highly flexible estimator which performs partic-
ularly well when nonlinear patterns need to be estimated in the presence of
highly correlated explanatory factors. Its main drawback is that it is a ‘black
box’ estimator in the sense that the researcher cannot recover the original
structural parameters and disentangle the importance of each explanatory
component.
Regarding the Price Forward Curve model, an ad-hoc equation widely
employed in the financial industry is investigated with the lad-lasso and the
ls-svm regression techniques. The estimation procedure in this case requires
several steps and adjustments due to the lack of some fundamental series for
the Swiss market. In particular, Swiss futures prices are estimated based on
futures prices observed in the most influential neighboring electricity mar-
kets, Germany and France.
Our main conclusions are that the robust linear (lad-lasso) estimator exhibits
better predictive performance for Swiss electricity spot prices under atypical
weather conditions or when the observed price pattern departs significantly
from past trends. The ls-svm method performs better under ‘regular’ condi-
tions, i.e. when past spot price fluctuations are highly persistent. Regarding
the price curve for the long-term contracts, we show that a meaningful Hourly
Price Forward Curve can be built for Switzerland by using data from the
German and French power markets. This does not come as a great surprise
given the strong correlation between the spot electricity prices in these markets.
We also show that the lad-lasso model highlights the particular
importance of hydropower in Switzerland. However, the support vector machine
estimates capture the seasonal variations of the futures prices better.
This thesis is organized as follows. Chapter 2 discusses the theoretical
background on the power market and describes the Swiss market. A brief
review of the relevant literature is proposed in Chapter 3. Chapter 4 presents
the empirical methodology. We start with an overview of the spot price and
Price Forward Curve models and proceed with a brief presentation of the
lad-lasso and ls-svm regression techniques. The results are given in Chapter 5
and Chapter 6 concludes.
Chapter 2
Electricity Market Settings
The models described in this paper aim at predicting the evolution of
electricity price indexes by integrating basic principles of financial theory
with structural determinants borrowed from the so-called bottom-up models
(see Fig. 2.1). The financial approach generally develops price scenarios
which depend on stochastic factors and market prices, and makes use of stochastic
differential equations. A fundamental assumption in these models is the
existence of highly liquid markets¹. This condition is not yet fulfilled in
most power markets. In addition, most of the electricity financial indexes
are heavily influenced by structural elements such as the weather or business
and seasonal cycles.
Fleten and Lemming [2003] proposed to combine all these determinants
within a single equation in order to “compensate for the deficiencies that
arise from separate use of either market data or bottom-up models”. This
Bayesian approach is particularly relevant for Switzerland, as it allows the
model to include all relevant information which may not be fully captured
by the available market data.
¹The liquidity of a commodity is its ability to be easily traded without influencing the market price.
Figure 2.1: A mixed approach. Source: Fleten and Lemming [2003].
In this chapter we describe the structural determinants and the relevant
financial indexes for our purpose. We begin by explaining key features of
electricity and proceed with a description of two major financial products:
the spot and futures contracts. The characteristics of the Swiss power market
are explored in the last section.
2.1 Physical and Economic Features of Electricity
Compared to other commodities, electrical energy has some unique at-
tributes.
A Non-Storable Good. Electricity is economically non-storable, there-
fore most standard commodity models are inadequate to explain the elec-
tricity market. One way to store electricity in the context of hydropower
generation is by pumping water into dams, but only with a significant loss.
Non-storability prevents market imbalances from being quickly addressed by
quantity adjustments.
A Three-Fold Periodic Pattern. Electricity demand is affected by cyclical
components which extend over three distinct time horizons linked to
global economic activity and seasonal variations:
• a yearly cycle linked to seasonal factors. For instance the cooling days
in summer and the heating days in winter have a strong impact on the
market (see Fig. 2.4). There are also some special events taking place
every year, like holidays and vacations.
• a weekly pattern. Demand is low during weekends, high during business
days (see Fig. 2.5). The weekend drop in consumption usually begins on
Fridays.
• a daily profile. Demand is very low at night and peaks at 7pm,
when everybody is back home from work (see Fig. 2.5).
The peak hours are usually defined as the time interval from 8am to 8pm,
for business days only.
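The peak-hour convention above can be written as a small predicate. The following is a minimal sketch, assuming that the interval is half-open (08:00 inclusive to 20:00 exclusive) and that business days are Monday to Friday with public holidays ignored; the function name `is_peak_hour` is illustrative and does not come from the thesis.

```python
from datetime import datetime


def is_peak_hour(ts: datetime) -> bool:
    """Return True if ts falls in the 8am-8pm window on a business day.

    Assumptions (not from the source): the window is half-open,
    [08:00, 20:00), and business days are Monday to Friday; public
    holidays are not handled.
    """
    is_business_day = ts.weekday() < 5  # Monday=0 ... Friday=4
    return is_business_day and 8 <= ts.hour < 20
```

A holiday calendar would have to be added before using such a rule on real data, since holidays behave like weekends in the demand patterns described above.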
Supply-Side Constraints. Beyond non-storability and cyclical variations,
the supply of electricity faces two major constraints linked
to its source and its distribution network (the grid).
• Electricity has to be supplied at exactly 50Hz, 220V into the grid. A
power grid is like a river that needs to keep the same flow and level.
Therefore, if someone pumps water from it, someone else must at the
same time pour the same amount of water. The interconnectivity of
the different national/regional grids is key for clearing the market
profitably: over-capacity can then be exported abroad, and under-capacity
can be avoided by importing power. Moreover, the grid has a bounded
capacity which cannot be easily increased, and thus congestion might
occur. Many congestion management tools have been put forward by
the EU. The more the national grids are inter-connected, the higher
the flexibility to avoid congestion.
• The generation capacity of a power plant is more or less flexible, de-
pending on the energy source:
– Nuclear, thermal and coal-fired plants cannot be easily stopped or
started and are basically always running. Changing their output is
a lengthy process.
– Power plants based on gas turbines or hydroelectric dams are
easier to regulate. Gas turbines can be activated on demand at
a high cost, whereas hydroelectric dams can generate power at any
time, provided the water level in the reservoirs is sufficiently high.
– Solar panels, wind turbines and other “green plants” must be treated
with caution from a grid point of view. Although weather forecasts
exist, their output remains difficult to predict. In some countries
the law might force the regulator of the grid to use “green energy”
in preference to any alternative source.
The Market Equilibrium
The market equilibrium results from the interplay between five main agents
(see Fig. 2.2 and Tab. 2.1), and the time horizon plays an important role.
In the short run, demand for electricity is highly inelastic to prices and it
exhibits strong variations over the day. The supply capacity being extremely
rigid, the short-term prices are essentially demand-driven. This is illustrated
in Fig. 2.3. The supply curve can be plotted as a stepwise and increasing
function where each step reflects the fixed marginal cost associated with a
specific energy source². This particular demand/supply configuration induces
abrupt changes (peaks or valleys) in the price equilibrium. Although the
supply curve is almost fixed in the short run, it can experience exceptional
shifts due to events such as a temporary maintenance of several nuclear
power plants (left-shift of the supply on Fig. 2.3) or holidays in a neighboring
country (right shift of the supply curve). The Swiss market equilibrium can
be very sensitive to changes in the foreign markets.
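The mechanism described above (a stepwise supply curve facing a price-inelastic demand) can be illustrated with a small merit-order calculation. This is only a sketch under stated assumptions: the plant capacities, marginal costs and demand levels are hypothetical numbers chosen for illustration, and `clearing_price` is not a function from the thesis.

```python
def clearing_price(plants, demand_mw):
    """Return the marginal cost of the last plant needed to cover demand.

    plants: list of (capacity_mw, marginal_cost) tuples.
    demand_mw: price-inelastic demand, in MW.
    """
    remaining = demand_mw
    # Dispatch the cheapest plants first (the stepwise supply curve).
    for capacity, cost in sorted(plants, key=lambda p: p[1]):
        remaining -= capacity
        if remaining <= 0:
            return cost
    raise ValueError("demand exceeds total available capacity")


# Hypothetical plant stack: (capacity in MW, marginal cost per MWh).
stack = [(3000, 10.0), (2000, 25.0), (1000, 60.0), (500, 150.0)]

night_price = clearing_price(stack, 2500)  # covered by the cheapest step
peak_price = clearing_price(stack, 6200)   # needs the most expensive step

# Taking 1500 MW of the cheapest capacity offline (for example for
# maintenance) shifts the supply curve to the left.
outage_stack = [(1500, 10.0)] + stack[1:]
outage_price = clearing_price(outage_stack, 2500)
```

With these illustrative numbers, the same night-time demand clears at a higher step once part of the cheapest capacity is unavailable, which is the kind of abrupt price change the text attributes to shifts of the supply curve.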
Figure 2.2: Structure of the electricity market
Note finally that the price is also influenced by the transporter in charge
of the grid, who takes a small fee for the delivery service, and the electricity
retailers who make profits out of the sales to the final consumers. These
operators may also be public-owned firms which are subject to heavy political
²Green sources are in general more expensive to exploit than nuclear power, for instance.
pressure.
Player       Actions                                          Examples
Generator    • Sells capacity to the retailers                Alpiq (CH), solar
             • Injects energy into the grid                   panel owners
Retailers    • Buy at low prices and sell at higher prices    EDF (FR), Axpo (CH)
             • Are often engaged in a price war with other
               retailers
Transporter  • State-owned, result of a natural monopoly      Swissgrid (CH),
             • Separated from the other players due to the    National Grid (UK)
               unbundling condition of the liberalization
               of the market
             • Controls the grid, the auctions, the
               inter-border transactions...
Consumers    • Are free to choose their retailer              Households, firms,
             • Usually, the first criterion is price          state
Regulator    • Supervises the liberalization of the market    Bundesnetzagentur (DE),
             • Checks that the laws (European and national)   Competition
               are applied                                    Commission (CH)
Table 2.1: The stakeholders of the power market
Figure 2.3: The supply and demand curves of the power market. The demand for electricity is highly inelastic.
2.2 Financial Approach
Electricity markets offer a wide range of standard financial instruments,
whose sophistication depends on the market maturity. The main ones are
spot, futures and forward contracts. More sophisticated derivatives, such as
options, have started to appear in more mature pools, like the Nord Pool.
There are about twenty electricity exchanges in Europe, some of which cover
several countries. The EU Commission has pushed the different exchanges to
cooperate in order to improve the integration of the European electricity
market. The exchanges we are interested in are the European Power Ex-
change (EPEX) in Paris for spot prices, and the European Energy Exchange
Power Derivatives (or EEX Power Derivatives) in Leipzig. The cooperating
countries in these exchanges are France, Germany, Austria and Switzerland.
2.2.1 The Spot Price
The spot market is the day-ahead market, where the power traded
is delivered the following day. The term ‘spot’ designates the price quoted for
immediate settlement. As explained in Section 2.1, the rigidities of the power
supply are so strong that the three cyclical patterns affecting the demand
side can be easily identified in the spot price curve. We can see on Fig. 2.4
that the prices are on average higher from the end of summer vacation until
the winter vacation. The average lowest price is usually reached in spring,
when there is neither intensive cooling nor long heating periods.
Figure 2.4: Swiss Spot Price in 2008 (axes: Time, Price/MWh)
The weekly and daily variations for two typical summer/winter weeks are
shown on Fig. 2.5. Note that the level of the curves can fluctuate considerably
within a season. Both series clearly display weekly as well as intra-day cycles.
During summer, the 1pm peak corresponds to air cooling while winter is
rather characterized by the 9am and 7pm peaks due to heating. We also
notice the irregular Friday pattern which precedes the weekend drop in
electricity demand.
Figure 2.5: Swiss electricity spot price for a typical week in summer and in winter 2008 (axes: Day, Monday to Sunday; Price/MWh; series: Winter, Summer).
2.2.2 Forwards and Futures
Forward and futures contracts are a type of financial “derivative” product.
They consist of an agreement between two parties to deliver a certain
amount of a good at a prearranged price (the forward/futures price). Unlike
spot contracts, forwards and futures can be continuously traded until their
delivery date. The difference between a forward and a futures contract is
that the terms and conditions for forwards are not standardized. Forwards
are negotiated to meet the specific business, financial or risk management
needs. Note also that futures are traded in an exchange while forwards are
traded over-the-counter. Before the liberalization of the power market, for-
ward contracts were the only long-term contract available in Switzerland.
Assuming the arbitrage-free condition³ holds, the classic theory of
rational pricing establishes a relationship between the forward price and the
value of its underlying (storable) asset⁴:
$$F_t = S_0 \left(1 + r_f + \text{storage costs} - \text{convenience yield}\right)^t \tag{2.1}$$
³An arbitrage occurs when one can make a profit out of the difference in prices in two markets. As an example, if the forward price was not to converge to the spot price at the delivery date, there would be an arbitrage opportunity. This convergence property is assumed to hold in all models presented in this paper.
⁴See Harris [2006].
where $F_t$ refers to the market price of a forward delivered at date $t$, $S_0$ is
the current spot price and $r_f$ corresponds to the risk-free rate. This formula
proposes a theoretical link between the price of buying a storable commod-
ity and storing it for further usage and the price of buying a right (forward
contract) of getting the commodity when needed in the future. This equality
simply states that the owner of the commodity must be compensated for all
incurred costs. Note that a commodity bought now can be used any time and
therefore provides a convenience yield. The difference between the storage
costs and the convenience yield is called the net convenience yield. Since
electricity cannot be stored, the net convenience yield becomes the conve-
nience yield which is difficult to estimate if the terms of the contracts are not
standardized [Carmona and Ludkovski, 2004].
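A short numerical illustration of Eq. (2.1) may help fix ideas. The sketch below simply evaluates the formula for a storable commodity; the input values are hypothetical, all rates are assumed to be per period and expressed as fractions, and the function name `forward_price` is illustrative.

```python
def forward_price(spot, risk_free, storage_cost, convenience_yield, t):
    """Theoretical forward price of Eq. (2.1) for delivery in t periods.

    All rates are per period and expressed as fractions (0.03 means 3%).
    """
    net_carry = risk_free + storage_cost - convenience_yield
    return spot * (1.0 + net_carry) ** t


# Hypothetical inputs: spot price 50, 3% risk-free rate,
# 2% storage costs and 1% convenience yield, delivery in two periods.
f_two_periods = forward_price(50.0, 0.03, 0.02, 0.01, t=2)  # about 54.08
```

With a net carry of 4% per period, the forward price grows to roughly 54.08 after two periods. For electricity this relationship is not directly usable, since, as noted above, the commodity cannot be stored.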
Electricity futures contracts are defined according to three main criteria:
the delivery date, the length of the delivery period and a daily component
(8am-8pm vs. 24/7). Delivery starts at the beginning of either a month, a
quarter (January, April, July and October) or a year⁵. Peak futures are delivered
only between 8am and 8pm whereas base futures are delivered 24/7.
Note that country-specific standards exist. In Germany, the upper limits for
each delivery range are seven months, seven quarters and six years ahead.
In France, these limits are three months, four quarters and three years ahead.
Fig. 2.6 shows a series of typical Peak and Base futures prices where the
horizontal regions indicate delivery period intervals. We employ a representation
called “implied Futures curve” which allows us to build a non-overlapping
and continuous sequence of prices for futures with different delivery periods.
⁵The only exception is the first monthly contract, which covers delivery for the ongoing month.
It consists in plotting the prices of the futures contracts, starting with the
contracts with the shortest delivery period over the permitted trading horizon
(monthly delivery starting at the beginning of months 1 to 7 for the German
case), followed by those with longer delivery ranges (quarterly contracts
starting in the next 2nd to 7th quarters and finally yearly contracts delivered
in the next 2nd to 6th years).
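The construction described above can be sketched in a few lines. This is a simplified illustration, not the procedure used in the thesis: it assumes a monthly grid, represents each contract by a hypothetical tuple (start month, number of delivery months, price), and resolves overlaps by letting the contract with the shortest delivery period set the price of each month. All quotes are made-up numbers.

```python
def implied_curve(contracts, horizon_months):
    """Build a non-overlapping monthly price sequence from futures quotes.

    contracts: list of (start_month, n_months, price), with start_month
        counted from 0 (the current month).
    Returns a list of length horizon_months; months covered by no
    contract are left as None.
    """
    curve = [None] * horizon_months
    # Process the longest delivery periods first, so that contracts with
    # shorter delivery periods overwrite them where they overlap.
    for start, length, price in sorted(contracts, key=lambda c: -c[1]):
        for month in range(start, min(start + length, horizon_months)):
            curve[month] = price
    return curve


# Hypothetical quotes: two monthly, one quarterly and one yearly contract.
quotes = [(0, 1, 40.0), (1, 1, 42.0), (0, 3, 43.0), (0, 12, 50.0)]
curve = implied_curve(quotes, 12)
```

In this toy case the first two months take the monthly quotes, the third month takes the quarterly quote and the remaining months take the yearly quote, which gives the flat delivery-period segments visible in a plot of this kind.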
Figure 2.6: Implied German Futures (axes: delivery date, Price/MWh; series: Futures Base, Futures Peak).
2.2.3 The Hourly Price Forward Curve
Futures curves such as the one in Fig. 2.6 provide information about
the long-term expectations of the market players. However, products with
maturities exceeding 6-12 months are rather scarce and interpolating a small
number of points with similar (but not identical) characteristics is not an
optimal way to price futures. Looking at forward prices would be of little
help. As noted by Fleten and Lemming [2003], forward contracts are usually
exchanged in large chunks, which makes it difficult to find prices for
specific maturity times or, more generally, to construct a high-resolution
term-structure curve. This is why futures or forward curves benefit from
being complemented with high-resolution Price Forward Curves. The Hourly
Price Forward Curve (HPFC) is such a curve and it describes the prices as
of today for the delivery of electricity at each hour in the future. It represents
the term-structure of forward prices in hourly resolution. It is important to
remember that the HPFC is not a forecast of the spot price. One can hardly
say whether today’s futures price will reflect the spot price at the delivery
date. This depends on many uncertain factors such as the global economic
climate, oil prices and agents’ irrational behaviour. Therefore, assessing the
quality of the generated forward curves is not straightforward.
2.3 The Swiss Electricity Market
Compared to other West-European electricity markets, the Swiss market
has three major differences: the relative importance of hydraulic power, its
particular position in the middle of Europe and its relative youth.
2.3.1 Power Production
Due to the federal system, the Swiss Power Market is highly fragmented.
About 900 companies are active in production and retailing, most of them
working at a cantonal or regional scale. However, only one transporter is in
charge of the grid: Swissgrid. Like 75% of the power utilities, Swissgrid is
state-owned.
With 56% of electricity produced from hydropower plants, Switzerland
has a particularly high share of hydropower in its total electricity output.
Combined with five nuclear plants representing about 35% of the total production,
this gives Switzerland an almost CO2-free power generation.
Figure 2.7: Swiss power production in 2008. Source: Swiss Federal Office for Energy - Schweizerische Gesamtenergiestatistik 2008.
2.3.2 A Key-Location
The location of Switzerland in the heart of Europe makes it an important
transit country. The main transit axis is the French-Italian one (see Fig. 2.9).
Switzerland was for years a net power exporter, but became a net importer
in 2007. With a total production of 64 TWh and a total consumption
of 63 TWh in 2008, the Swiss trade balance of energy is almost null, but there
are strong seasonal and daily variations. Switzerland cannot rely entirely on
hydropower from winter to late spring because the snow starts melting
only in spring (see Fig. 2.8). The reservoirs reach their lowest level in
May and are filled in summer, when precipitation is abundant. In order
to keep a safety margin for spring in case of winters with heavy heating demand,
the Swiss market is a net importer in winter (-4.5 TWh). By contrast, once
the reservoirs start to fill up again in late spring, Switzerland becomes a net
exporter until the end of summer (+5.5 TWh for summer 2008).
Figure 2.8: Average reservoirs level in Switzerland (axes: Time, Level (%)). The filling is done in summer, when precipitation is most abundant. To avoid shortage in spring the Swiss market is a net importer in winter. Source: Swiss Federal Office of Energy - Schweizerische Gesamtenergiestatistik 2008.
Interestingly enough, the large hydropower capacity of Switzerland gives
it a comparative advantage with respect to its neighbors. Since the only way
to store electrical power is by pumping the water up to the reservoirs, and
since water plants can deliver electricity at any time, Switzerland has become
a supplier for its EU neighbours at peak hours. Indeed, as one can hardly
slow down a nuclear or a thermal plant, the energy price drops significantly
during off-peak hours. Swiss generators take advantage of this situation by
pumping water into their dams during off-peak hours and selling their ‘stored’
electricity during peak hours. With an 8.5 TWh water storage capacity, Swiss
generators have enough capacity to meet demand from home and from abroad
during peak hours. This profit scenario applies especially to France, which
relies heavily on nuclear power: as shown in Fig. 2.9, France is the only net
exporter of electricity to Switzerland. This interdependence between France
and Switzerland will be exploited to build the HPFC in Section 5.2.
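The pumping strategy described above is profitable only if the peak price compensates for the energy lost in the pump-and-release cycle mentioned in Section 2.1. The following minimal sketch makes this explicit; the prices and the 75% round-trip efficiency are hypothetical values chosen for illustration, and `pumped_storage_margin` is not a function from the thesis.

```python
def pumped_storage_margin(off_peak_price, peak_price, efficiency):
    """Profit per MWh of electricity bought off-peak for pumping.

    efficiency: fraction of the pumped energy recovered at generation
        (round-trip efficiency), between 0 and 1.
    """
    revenue = efficiency * peak_price  # only part of the energy comes back
    return revenue - off_peak_price


# Hypothetical prices per MWh and a hypothetical 75% round-trip efficiency.
margin = pumped_storage_margin(off_peak_price=30.0, peak_price=80.0,
                               efficiency=0.75)
break_even_peak = 30.0 / 0.75  # peak price at which the margin is zero
```

Under these assumptions the operation breaks even at a peak price of 40 per MWh, so the wider the gap between off-peak and peak prices, the more attractive the storage becomes.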
Figure 2.9: Trade balance (yearly-based and winter-based). Source: Centre for Energy Policy and Economics, Swiss Federal Institute of Technology - Report: Electricity gap; ways to face the challenge.
2.3.3 Financial Data
The only financial indexes available for the Swiss power market are the
Swiss Electricity Price Index (SWEP) and the Swiss Electricity Index (Swissix).
The SWEP is a local indicator of one-day-ahead over-the-counter prices.
It was launched in 1998 and it became the first wholesale electricity price
index published on the European continent. The Swissix is the average price
at the European Energy Exchange (EEX) for next-day deliveries in the Swiss
grid, hourly based, with base and peak series. It was launched in 2006. Relative
to the SWEP it has a wider range and is not affected by local effects⁶.
The Swissix is the reference spot price for this study.
Although these two indicators give a good idea of the historical evolution
of electricity prices, they capture only around 10%⁷ of the Swiss market. The
remaining 90% are over-the-counter contracts (not referenced by the SWEP),
largely influenced by public service obligations and therefore with biased,
unreferenced prices. This problem is also often encountered in the rest of
the European countries.
Finally, the main issue with the Swiss financial data, relative to France
and Germany, is the absence of futures products. The only financial data
we possess are the spot prices, which are past-oriented, so that there are no
long-term expectation indicators. The models developed in the literature to
build Price Forward Curves therefore cannot be directly applied. This issue
is addressed in Section 5.2.
⁶Since there is no official price for over-the-counter trading, the SWEP is just the volume-weighted average at the 380-kV Laufenburg grid hub.
⁷IEA [2007].
Chapter 3
Literature Review
Since the beginning of the liberalization of the European energy market,
papers on predicting energy consumption or prices have flourished,
with the purpose of improving risk management. The HPFC is a key tool
for optimizing the production capacities of power plants and for better
estimating firms' upcoming income. Energy producers can also use load
models for real-time scheduling of electricity generation. In this
literature review, we focus on the papers related to price forecasting
and the construction of forward curves.
3.1 Load Models
Given the strong correlation between the electrical load and electricity
spot prices, load models can help identify the main drivers of
electricity prices. These models generally assume a deterministic path and
employ hourly data to forecast up to seven days ahead. Based on load data
from Brazil, Soares and Medeiros [2008] compare a purely stochastic-trend
model (SARIMA-type) with an autoregressive model with a flexible
deterministic trend component (TLSAR-type) and conclude that a
deterministic-based approach performs better for short-run forecasts1.
They also find no evidence that nonlinear models are better in terms of
predictive performance. Therefore, capturing the deterministic trend
explicitly seems to be important for short-run load forecasts. These
authors do not include weather variables in their model, but they
emphasize that such variables can significantly improve the fit.
Taylor [2008] uses a very short-term model (ten minutes ahead) based
only on past load data. He finds that for forecasts longer than four hours
ahead, models with weather variables are superior to the purely autoregres-
sive models based on last week data. He then emphasizes the importance of
the accuracy of the weather forecasts for the predictive performance.
Finally, Amaral et al. [2008] compare different linear and non-linear
methods, based on Australian load data. They propose a specific treatment
for special days like holidays, and point out that non-linear methods can be
more efficient than linear ones for short-term forecasting (one day ahead)
while basic linear models are better for longer time spans.
3.2 Spot and Forward Price Models
Most energy price models are short-term prediction models of the price
on the spot market (the day-ahead market). They usually focus on the
most liberalized markets, where data are abundant and rigidities are
low, such as the Nord Pool market (which includes Sweden, Norway and
Finland). Weron and Misiorek [2008] use this market to contrast
parametric versus semi-parametric models and show that the semi-parametric
models perform better and are more sensitive to exceptional market
conditions (peak demand, weather conditions). They conclude that
autoregressive models for short-term forecasts on an hourly basis are the
best in terms of predictive power.
1SARIMA stands for Seasonal Autoregressive Integrated Moving Average and TLSAR is a Two-Level Seasonal Autoregressive model.
Not all price models focus on short-term predictions. Some recent models
aim at constructing forward prices, i.e. the Price Forward Curve itself.
Based on the Nordic market, Fleten and Lemming [2003] use bid-ask data
of futures products to construct the long-term product and emphasize that
the method performs well in the range of four to ten months ahead. Jump-
diffusion models are also often employed to capture the spiky behavior of
the spot price. However, Chan et al. [2008] argue that traditional financial
approaches (like the jump-diffusion model) are unsuccessful in capturing the
spot price dynamics.
Chapter 4
Estimation Methodology
As outlined in Chapter 2, electricity prices display multiple cyclical
behaviors which must be taken into account in the modeling process. These
structural determinants (e.g. daily/monthly/yearly seasonal patterns) may
be heavily correlated among themselves and with other independent
variables (e.g. weather indicators). Moreover, daily transactions generate
a small number of peak values (extreme spot prices) which may excessively
influence the fits. Section 4.1 presents the spot price and the forward
price models, while Section 4.2 details the regression methods used to
estimate them. The lad-lasso estimator combines regularization techniques
(shrinkage regression), which control the bias-variance trade-off, with a
robust approach (LAD minimization). The ls-svm regression is particularly
appropriate for estimating non-linear relationships in the presence of
strongly correlated predictors.
4.1 Presentation of the Models
In this section we describe two models based on the mixed approach
introduced by Fleten and Lemming [2003], see page 6. The short-term spot
price model is directly inspired by the short-term vertical load model
described in Espinoza et al. [2007]. The HPFC model has been developed by
swissQuant Group and Axpo and corresponds to a model widely used in
the power industry.
4.1.1 A Short-Term Spot Price Model
Methodology
Espinoza et al. [2007] use an ls-svm regression technique to estimate an
autoregressive equation (called ar-lssvm in JAK and Vandewalle [2000]) for
predicting short term loads. Their model mixes a load dynamics component
(autoregressive part) with daily trends and weather indicators to predict fu-
ture loads. Although vertical load series display a similar pattern to the spot
price, the former ones are smoother and less noisy in general.
Consider the following model:
$$y_t = f(x_t) + e_t,$$
where $y_t$ denotes the electricity price at time $t$ (each hour), $f(x_t)$ is an
unknown (possibly non-linear) function and $x_t \in \mathbb{R}^n$ is the regressor vector:
$$x_t = \{y_{t-1}, \ldots, y_{t-j}, H_t, D_t, W_t\},$$
with
• $W_t$: weather forecast indicators composed of the temperature $FT_t$ and
heating and cooling indicators defined respectively as $FH_t = \max(18 - T_t, 0)$ and $FC_t = \max(T_t - 20, 0)$;
• $D_t \in \{0,1\}^7$: a binary-valued vector which captures the effects of each
day of the week;
• $H_t \in \{0,1\}^{24}$: a binary-valued vector which captures the effects of each
hour of the day.
The parameter j denotes the size of the auto-regressive part. Let ∆ be the
number of days forming the auto-regressive part (j = 24 × ∆). When hourly
spot price curves are estimated, hourly temperature forecasts are needed in
Wt. In general, the temperature for the forthcoming days is predicted in
terms of expected mean, maximum and minimum (as is the case in Switzerland).
Hourly forecasts can be reconstructed with the help of profiles
stemming from temperatures observed on an hourly basis in the past. The
global estimation procedure is illustrated in Fig. 4.1.
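As a minimal sketch of how the regressor vector described above could be assembled, the following Python function stacks the lagged prices, the hour and day dummies and the three weather terms. The function name, its arguments and the assumption that the series starts at hour 0 of a day are ours and purely illustrative; they are not taken from the original implementation.

```python
import numpy as np

def build_regressors(prices, temps, weekdays, delta):
    """Build one regressor row per hour: lags, hour/day dummies, weather.

    prices, temps: hourly arrays of equal length (hypothetical inputs).
    weekdays: integer day of the week (0..6) for every hour.
    delta: number of days in the auto-regressive part (j = 24 * delta lags).
    Assumes the series starts at hour 0 of a day, so t % 24 is the hour.
    """
    prices = np.asarray(prices, dtype=float)
    temps = np.asarray(temps, dtype=float)
    weekdays = np.asarray(weekdays, dtype=int)
    j = 24 * delta
    rows, targets = [], []
    for t in range(j, len(prices)):
        lags = prices[t - j:t][::-1]             # y_{t-1}, ..., y_{t-j}
        hour = np.eye(24)[t % 24]                # H_t: one-hot hour of day
        day = np.eye(7)[weekdays[t]]             # D_t: one-hot day of week
        weather = np.array([
            temps[t],                            # temperature forecast
            max(18.0 - temps[t], 0.0),           # heating indicator
            max(temps[t] - 20.0, 0.0),           # cooling indicator
        ])
        rows.append(np.concatenate([lags, hour, day, weather]))
        targets.append(prices[t])
    return np.array(rows), np.array(targets)
```

With ∆ = 1 each row holds 24 lags, 24 hour dummies, 7 day dummies and 3 weather terms, i.e. 58 regressors.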
Validation of the Model
Performance Indicators Since our spot price model provides spot price
forecasts, standard indicators can be used to assess the quality of the fit
out-of-sample (this is called ‘backtesting the model’ in finance). Denoting the
observed data $y$, the fit $\hat{y}$, the number of observations $N$ and the arithmetic
mean of $z$ as $\bar{z}$, the following four performance indicators are considered:
• The correlation coefficient:
$$\frac{\sigma_{y,\hat{y}}}{\sigma_y \sigma_{\hat{y}}} = \frac{\sum_i (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_i (y_i - \bar{y})^2}\,\sqrt{\sum_i (\hat{y}_i - \bar{\hat{y}})^2}},$$
• The mean error:
$$\frac{\frac{1}{N}\sum_i |y_i - \hat{y}_i|}{\frac{1}{N}\sum_i y_i} \times 100,$$
• The mean absolute prediction error:
$$\frac{1}{N}\sum_i \frac{|y_i - \hat{y}_i|}{|y_i|},$$
• The mean standard deviation error:
$$\mathrm{var}\left(\frac{y - \hat{y}}{y}\right) \times 100 = \frac{1}{N}\sum_{i=1}^{N} (\epsilon_i - \bar{\epsilon})^2 \times 100,$$
with $y > 0$ and $\epsilon$ the relative error: $\epsilon_i = \frac{y_i - \hat{y}_i}{y_i}$.
Figure 4.1: The short term spot price model framework.
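The four indicators can be computed in a few lines. The sketch below is only an illustration of the formulas above; the function and key names are hypothetical, and the inputs are assumed to be strictly positive observed prices.

```python
import numpy as np

def performance_indicators(y, y_hat):
    """Return the four backtesting indicators for observed y and fit y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    abs_err = np.abs(y - y_hat)
    rel_err = (y - y_hat) / y          # relative error, requires y > 0
    return {
        "correlation": float(np.corrcoef(y, y_hat)[0, 1]),
        "mean_error_pct": float(abs_err.mean() / y.mean() * 100),
        "mape": float(np.mean(abs_err / np.abs(y))),
        # np.var uses the 1/N normalisation, as in the formula above
        "mean_std_dev_pct": float(np.var(rel_err) * 100),
    }
```

For example, with y = (10, 20, 30, 40) and fit (11, 19, 33, 36), the mean error is 9% and the mean absolute prediction error is 0.0875.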
Falsifiability Tests In addition to the statistical measures of fit, we
submit the spot model to a ‘falsifiability test’ à la Popper. Indeed, the true
data generation process of the spot prices is expected to depend on weather
predictions, among other determinants. The reason for this is that the
transactions on the spot market are settled one day before delivery. Traders
use weather forecasts to decide on the quantity to purchase or sell and at
which price. Therefore we would expect the spot model to perform better
when weather predictions for the next day are used as predictor than when,
say, the true weather observed the next day is used. Several weather
scenarios can be tested against the weather forecast variable in that
perspective:
• the observed (true) weather
• a normalized weather (seasonal weather)
• a random weather.
Letting ℘(.) be the prediction accuracy of a model, we expect the follow-
ing preference order to hold
℘(forecasted weather) ≻ ℘(seasonal expectations) ≻ ℘(random walk weather).
We also expect that
℘(forecasted weather) ⪰ ℘(observed weather).
In the latter preference order, the identity relationship arises in case of perfect
weather forecasts. These tests may be useful to discriminate two estimation
methods which perform similarly in terms of out-of-sample predictions.
4.1.2 An Hourly Price Forward Curve Model for electricity (HPFC model)
A fundamental component of the HPFC is the futures price series. No
futures products for electricity exist in Switzerland. However, the German
and French markets offer a small set of them. Figure 2.6 on page 15 shows the
only forward curve that can be observed in the market. Note the September
and winter peaks, which correspond to periods of the year when firms
need to secure power supply. After the first seven months, the curve
becomes flat and its shape provides no precise guidance on future
fluctuations (see [Espinoza et al., 2006]). Note also the upward trend of
long-term contracts: the further away from the delivery date, the higher
the cost of hedging.
The construction of the HPFC relies on a combination of characteristics
extracted from the spot price series and observed futures prices. The
seasonal, weekly and daily variations are taken from the historical spot
prices, whereas the average values of the HPFC are provided by observed
futures products. In other words, the guideline is to apply a complex
coefficient structure to the observed futures prices1 to obtain hourly
values. This approach relies on structural links between the short-term
(spot) and long-term markets. Although equation 2.1 does not apply formally
to electricity products, we assume that an approximate link between the
spot and the forward prices exists.
1The construction of the futures curve is described on page 15.
Methodology
Our goal is to estimate $P(t,h)$, the "hourly forward price" for every day
$t$ and hour $h$ over the time horizon defined by the futures curve. The Peak
and Base futures curves can be denoted by $(F_i^{Base}, F_i^{Peak})$ with $i \in [1, N_F]$,
where $N_F$ represents the number of products used to build the curve.
These two curves can alternatively be represented as a single stepwise
function fluctuating between Peak and Base values. The latter representation is
denoted $F(t,h)$. In order for the HPFC to be fully consistent with $F(t,h)$,
the following arbitrage-free conditions must hold
$$E_{(t,h)}[P(t,h)] = F_i^{Base} \quad \text{for } (t,h) \in T_{F_i^{Base}} \qquad (4.1)$$
$$E_{(t,h)}[P(t,h)] = F_i^{Peak} \quad \text{for } (t,h) \in T_{F_i^{Peak}} \qquad (4.2)$$
where $T_F$ represents the time horizon of a specific contract represented
in the implied curve. In order to capture the seasonal patterns in the
high-resolution representation of the price forward curve, we introduce the hourly
coefficient $s(t,h)$ and set
$$P(t,h) = F(t,h) \times s(t,h),$$
where $s(.)$ must comply with conditions (4.1) and (4.2). To determine $s(.)$,
we adopt a bottom-up approach which captures the weather factors and
daily and seasonal components likely to influence $F(.)$. The details of the
four-step procedure for building $s(.)$ are outlined below:
1. We first estimate the following regression over two years of data and we
get $m$, i.e. daily estimates of the spot price based on purely bottom-up
components:
$$S(t) = m(W(t), D(t), M(t)) + \epsilon_t,$$
with
• $S(t)$: historical daily spot prices;
• $W(t)$: daily weather consisting of the mean, the highest and the
lowest temperature of the day, the precipitation, the wind speed,
the relative humidity, and the heating and cooling indicators;
• $D(t)$: matrix of dummy variables with 0/1 values for each different
day of the week. A single variable is used for Tuesday, Wednesday
and Thursday and holidays are treated as Sundays.
• $M(t)$: matrix of dummy variables with 0/1 values for each different
month of the year.
2. The second step consists of an out-of-sample simulation over a time
horizon given by the time range of the implied futures curve, where
each component of $W(t)$ is set to its daily mean value over the last
40 years to capture expected values for the season and where the daily
and monthly indicators $(D_{future}(t), M_{future}(t))$ are projected over the
pertinent time horizon:
$$S(t)_{future} = m(W_{norm}(t), D_{future}(t), M_{future}(t)).$$
3. In the third step, we transform the daily predicted values $S(t)_{future}$
into hourly profiles. This is done with the coefficients $p(t,h)$, which are
built from daily means of hourly spot prices observed during the last
two years2. Note that for each day, the profile is normalized so that
the following condition is met3:
$$\forall t, \; E_h[p(t,h)] = 1.$$
Then, an estimate for $s(.)$ can be obtained by setting
$$s(t,h) = S(t) \times p(t,h).$$
4. In the last step, once the predicted $S_t$ are hourly-based, they are
calibrated or adjusted to each of the steps of the implied futures curve,
so that they fluctuate around these steps and match an arbitrage-free
condition. More precisely, for each futures product (e.g. Month 2 Peak
product4), we normalize the coefficients $s(t,h)$ to $\hat{s}_{T_F}(t,h)$ in order to
fulfill the following arbitrage-free condition:
$$E_{(t,h) \in T_F}[\hat{s}_{T_F}(t,h)] = 1.$$
Recall that $T_F$ is the time horizon of a specific contract represented
in the implied futures curve, i.e. the length of one of the steps in the
implied futures curve. Denoting $F(t,h)$ the price of the futures for the
delivery hour $h$ at day $t$ and $P(t,h)$ the estimated HPFC at day $t$ and
hour $h$, we have:
$$P(t,h) = F(t,h) \times \hat{s}_{T_F}(t,h).$$
2Here, we use month-day clusters. As an example, a historical profile for the Mondays of September is obtained by averaging, for each hour, all spot values of the eight Mondays of September for the last 2 years.
3This is done by simply dividing the mean hourly-based daily profiles by their mean over the whole day.
4Starting arbitrarily on September 15, 2009, since the first monthly product concerns the current month, the Month 2 Peak product would have a delivery period from the 1st to the 31st of October, from Monday to Friday, 8 am to 8 pm.
A few words on the arbitrage-free condition used at stage 4 of the HPFC
procedure may be necessary. This condition guarantees that the mean value
of the HPFC over the delivery period of a certain product is equal to the
product value itself. The global estimation steps of the HPFC are illustrated
in Fig. 4.2.
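Steps 3 and 4 of the procedure can be sketched as follows. This is an illustrative Python sketch, not the swissQuant implementation: the function name, the array layout and the integer `product_id` array (a hypothetical encoding of which futures product covers each hour) are our assumptions.

```python
import numpy as np

def hpfc(daily_forecast, hourly_profile, futures_price, product_id):
    """Combine daily forecasts, hourly profiles and futures prices.

    daily_forecast: shape (T,), daily spot prices predicted in step 2.
    hourly_profile: shape (T, 24), raw hourly shape for each day.
    futures_price:  shape (T, 24), stepwise futures curve F(t, h).
    product_id:     shape (T, 24), integer label of the futures product
                    covering each hour (hypothetical encoding of T_F).
    """
    daily_forecast = np.asarray(daily_forecast, dtype=float)
    hourly_profile = np.asarray(hourly_profile, dtype=float)
    # Step 3: normalise each daily profile to mean one, then scale it
    # by the daily forecast to obtain the hourly coefficients s(t, h).
    p = hourly_profile / hourly_profile.mean(axis=1, keepdims=True)
    s = daily_forecast[:, None] * p
    # Step 4: rescale s within each product so that its mean over the
    # delivery period equals one (arbitrage-free condition).
    s_hat = np.empty_like(s)
    for k in np.unique(product_id):
        mask = product_id == k
        s_hat[mask] = s[mask] / s[mask].mean()
    return futures_price * s_hat
```

By construction, the mean of the returned curve over the hours of a product equals the futures price of that product whenever that price is constant over its delivery period.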
Validation of the HPFC Model
Since the HPFC is not a prediction of the spot price, evaluating the
quality of the fit is not straightforward. A qualitative assessment can be
made based on criteria such as the plausibility of the HPFC shape, i.e. its
ability to capture seasonal patterns usually observed in high-frequency
price series, or stylized facts such as low/high prices during
holiday/working periods. In addition, the first stage of the HPFC estimation
procedure provides results that can be contrasted with facts observed in
the market under study. Finally, the HPFC can also be tested in the short
run in the same spirit as the spot price model. Indeed, for a long-term
contract close to its delivery date, the no-arbitrage condition implies
convergence toward the spot price, and we expect the contract value to be
close to the spot price.
Figure 4.2: The statistical approach for the HPFC model
4.2 Regression Techniques
4.2.1 The Lad-lasso Regression
According to Tibshirani [1996], the ordinary least squares estimator (OLS)
suffers from two main drawbacks:
• Its lack of accuracy: OLS estimates achieve low bias at the expense of
a high variance. The prediction accuracy can be improved by tolerating
a small amount of bias, i.e. dropping the less significant factors or
"shrinking" some coefficients toward zero.
• Its lack of interpretability: all factors have non-zero coefficients,
whereas a smaller subset of coefficients (those with the strongest effects)
may suffice to achieve a model that performs well and provides a sensible
interpretation.
One of the main ideas to address these two drawbacks is to introduce
a Tikhonov regularization term [Tikhonov, 1963] that controls the
bias-variance trade-off in the regression. This procedure gave rise to ridge
regression, a variant of the least squares technique intended to counter the
effects of collinearity among the regressors. More recently, Tibshirani
[1996] introduced the Lasso (Least absolute shrinkage and selection
operator), which combines in a single procedure the desirable features of
ridge regression and subset selection, i.e. a continuous process of
"shrinking coefficients" that results in setting some coefficients to zero.
Later, Wang et al. [2007] suggested a robust regression method based on a
combination of the least absolute deviation and the lasso, giving the
lad-lasso. The use of the absolute deviation instead of the squared
deviation makes the model more robust to outliers, and the lasso provides
variable selection and therefore a good bias-variance trade-off.
Consider the usual linear model equation:
$$Y = X\beta + \epsilon,$$
where $\beta$ is the unknown coefficient vector, $X$ represents the matrix of the
explanatory variables, $Y$ is the response or explained variable and $\epsilon$ is an
error term. The lad-lasso optimization problem is given in its canonical
form by:
$$\beta(s) = \arg\min_{\sum_{j \in [1,p]} |\beta_j| \le s} |Y - X\beta|,$$
where $|Y - X\beta|$ is the sum of the absolute residuals, $p$ is the dimension of
the factor matrix $X$ and $s$ is the regularization coefficient which controls
the amount of shrinkage that is applied to the estimates. This parameter can
be determined by cross-validation5. However, since there are $p$ tuning
parameters, the search can be computationally heavy. To address this issue,
Wang et al. propose the BIC-type objective function:
$$\sum_{i=1}^{n} |y_i - x_i\beta| + n\sum_{j=1}^{p} \lambda_j|\beta_j| - \sum_{j=1}^{p} \log(5n\lambda_j)\log(n),$$
which avoids the lengthy cross-validation process and sets the $\lambda_j$ parameters
to
$$\lambda_j = \frac{\log(n)}{n|\beta_j|},$$
where $n$ is the number of sample points and the $\lambda_j$ are the tuning parameters
controlling the shrinkage. This corresponds to a relatively ‘tight’ lasso,
which is therefore very robust. The code used for the lad-lasso has been
developed by swissQuant Group.
5A numerical criterion measuring the forecasting power of the regression.
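The swissQuant code is not reproduced in this thesis. As a hedged illustration only, the penalized problem $\sum_i |y_i - x_i\beta| + n\sum_j \lambda_j|\beta_j|$ can be written as a linear program and solved with `scipy.optimize.linprog`. The function names, the splitting of variables into positive and negative parts, and the choice of an initial estimate for the weights are our assumptions, not the original implementation.

```python
import numpy as np
from scipy.optimize import linprog

def lad_lasso(X, y, lam):
    """Minimise sum_i |y_i - x_i b| + n * sum_j lam_j * |b_j| as an LP.

    lam may be a scalar or an array with one penalty per coefficient.
    """
    n, p = X.shape
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (p,))
    # Non-negative decision variables:
    # b_plus (p), b_minus (p), u_plus (n), u_minus (n),
    # with b = b_plus - b_minus and residual = u_plus - u_minus.
    cost = np.concatenate([n * lam, n * lam, np.ones(n), np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:2 * p]

def bic_type_weights(beta_init, n, eps=1e-8):
    """lambda_j = log(n) / (n * |beta_j|), guarded against division by zero.

    beta_init is an initial estimate; using the unpenalised LAD fit
    (lad_lasso with lam = 0) is an assumption made for this sketch.
    """
    return np.log(n) / (n * np.maximum(np.abs(beta_init), eps))
```

A possible two-stage use is to call `lad_lasso(X, y, 0.0)` first, feed the result to `bic_type_weights`, and then refit with the resulting penalties.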
4.2.2 The Least-squares Support Vector Machine Regression
A major development in the field of non-linear regression is the
statistical learning theory developed by Vapnik. This author gives a
framework based on the concept of empirical risk minimization (how to
minimize the loss in data modeling from an empirical sample), leading to
the support vector machine (svm) theory. Suykens and Vandewalle [1999]
propose a slightly modified version of this theory, the least-squares svm
(ls-svm), which introduces a least squares term in the optimization problem.
The support vector machine is currently considered the state-of-the-art
technique for classification problems. Subsequent developments include the
ls-svm with a symmetric part [Espinoza et al., 2005] and the fixed-size
ls-svm [Espinoza et al., 2006]. It is important to point out that the ls-svm
estimator behaves like a black box, i.e. the non-linear transformations
involved do not allow the parameters associated with each predictor to be
recovered. The philosophy here is therefore very different from that of the
lad-lasso. For the computation, the toolbox LS-SVMLab developed by Suykens
et al. [2002] was used.
Overview of the Objective Function
The key idea of the ls-svm regression is to map the regression space into a
higher dimensional space and find a linear hyperplane with the help of kernel
functions (the so-called kernel trick). We briefly introduce the objective
function here; the optimization problem as well as the two key ideas of the
ls-svm theory are presented in Appendix A.
Consider the sample of points $(x_t, y_t)$ where $t \in [1,N]$, $M$ independent
variables ($\forall t \in [1,N]$, $x_t = [x_{t1}, \ldots, x_{tM}]$) entering the unknown function
$f$, and let
$$y_t = f(x_t) + e_t, \quad t \in [1,N].$$
In the absence of prior information about the structure of $f()$, this
function can be parametrized in a primal space based on ls-svm, that is
$$y_t = \omega^T \varphi(x_t) + b + e_t,$$
where $\omega$ is an unknown real-valued coefficient vector and $b$ a bias term. The
feature map $\varphi$ is unknown and transforms the input data into a higher
dimensional vector. Introducing the least squares cost function, we can define
the objective function of the problem and its constraints:
$$\min_{\omega, b, e_t} \; \frac{1}{2}\omega^T\omega + \gamma\frac{1}{2}\sum_{t=1}^{N} e_t^2,$$
with $y_t = \omega^T \varphi(x_t) + b + e_t$. The parameter $\gamma$ is a regularization parameter.
As in the lad-lasso, we use a Tikhonov regularization. Introducing Lagrange
multipliers and using Mercer’s theorem (see Appendix A), the ls-svm theory
shows that $f()$ can be expressed in terms of a positive-definite kernel function
$K()$ without having to compute the feature map $\varphi$. The model expressed in
this dual space is:
$$y_t = \sum_{i=1}^{N} \alpha_i K_\sigma(x_t, x_i) + b + e_t,$$
where $\sigma$ is an exogenous parameter. The kernel function can be any standard
kernel, like the polynomial or the uniform kernel. In our application, we use
the Gaussian radial basis function kernel given by
$$K(x, z) = \exp\left(-\frac{\lVert x - z \rVert_2^2}{\sigma^2}\right).$$
The estimates of the model depend heavily on the two hyper-parameters
(γ,σ).
Determination of the Optimal Parameters (γ,σ)
The two parameters (γ,σ) control the bias-variance trade-off. If the bias
is low, the fit on the training set will be very good but the variance of the
prediction might be very high. The better the fit, the higher the sensitivity
to outliers and the risk of overfitting6. Increasing the smoothness of the
fits reduces the variance but increases the bias. There is therefore an
optimal pair (γ,σ) to find.
One popular way to optimize the predictive power of the model over a set
of candidate parameters is cross-validation (CV). Once the kernel matrix
has been computed, the leave-one-out cross-validation score can be easily
obtained for different pairs (γ,σ) defined over a grid. Another method,
proposed by Keerthi et al. [2007], is to perform a gradient search on the
same grid, in the manner of Newton's algorithm. According to Cawley and
Talbot [2007], the cross-validation criterion prevents overfitting if there
are only two hyper-parameters. For more than two parameters, they recommend
a Bayesian regularization approach. In this thesis, we use cross-validation
to find the optimal pair.
6Overfitting occurs when a regression captures irrelevant features of a particular sample.
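The computations in this thesis rely on the LS-SVMLab toolbox. Purely as an illustration of the dual formulation and of the grid search over (γ,σ), the following NumPy sketch solves the standard ls-svm linear system and scores each pair by k-fold cross-validation. The function names are hypothetical, the kernel bandwidth convention exp(-||x - z||² / σ²) is an assumption, and k-fold CV is used here as a cheaper stand-in for the leave-one-out score mentioned above (setting the number of folds to N recovers leave-one-out).

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2)   # bandwidth convention is an assumption

def lssvm_fit(X, y, gamma, sigma):
    """Solve the dual system [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]          # bias b, dual coefficients alpha

def lssvm_predict(X_train, b, alpha, sigma, X_new):
    """Evaluate sum_i alpha_i K(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def cv_score(X, y, gamma, sigma, n_folds=5):
    """Mean squared prediction error over contiguous folds."""
    idx = np.arange(len(y))
    errors = []
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        b, alpha = lssvm_fit(X[train], y[train], gamma, sigma)
        pred = lssvm_predict(X[train], b, alpha, sigma, X[test])
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

def grid_search(X, y, gammas, sigmas):
    """Return the (gamma, sigma) pair with the lowest CV score and all scores."""
    scores = {(g, s): cv_score(X, y, g, s) for g in gammas for s in sigmas}
    return min(scores, key=scores.get), scores
```

In this formulation the in-sample residuals equal alpha / gamma, which makes the role of γ as a regularization parameter explicit: a large γ forces small residuals and hence a tighter fit.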
Chapter 5
Empirical Analysis
The main goal of the thesis is to apply the models presented in the
previous chapter to Switzerland. This chapter describes the data employed in
each model as well as the results obtained with the lad-lasso and the ls-svm
approaches. Note that all data come from Bloomberg.
5.1 Spot Price Model
The equation of the spot price model is given by:
St = f(St−1, ..., St−j ,Ht,Dt,Wt).
The first decision we need to make concerns the time horizon and the
frequency of the observations. We estimate hourly spot prices based on 30
days of data taken from August 1-31, 2009, i.e. the St vector is of size
24 × 30. We use the Swiss spot price index traded on EPEX (the Swissix).
Regarding the weather indicator, the only hourly-based temperatures
available in Switzerland are those measured at Geneva and Zurich airports.
The mean of these two series is used as the Swiss hourly temperature
indicator. We build hourly temperature forecasts by combining the latter mean
with the expected daily mean, maximum and minimum temperatures from
MeteoSuisse1.
The daily dummy matrix Dt defined in Section 4.1.1 is used with the
following slight modification: Swiss national holidays are coded as Sundays.
Finally, the spot model is tested for different lag orders and its predic-
tive performance is evaluated over various forecasting horizons (1, 4 and 7
days ahead). Before presenting the empirical results, we analyze the cross-
validation procedure for the ls-svm regression model.
Cross-Validation Score for the ls-svm Model
The cross-validation score of the ls-svm fits is computed for a given grid
of plausible values for the pair (γ,σ). Fig. 5.1 presents the results for two
different specifications of the spot model: the left-hand side plot refers to
an auto-regressive model with 24 lags (a full day), while the right-hand side
model has no lag. We notice that the auto-regressive component is key to
getting accurate in-sample predictions. Though the global shape of the CV
function is quite similar, the CV magnitudes are much lower for the
auto-regressive model.
1Denoting $T_{high}(d)$ and $T_{low}(d)$ the maximum and minimum predicted temperatures for day $d$, $z(h)$ the hourly mean temperature for each hour over the 30 training days and $FT(d,h)$ the hourly forecasts, we have:
$$FT(d,h) = \frac{T_{high}(d) - T_{low}(d)}{\max_h z(h) - \min_h z(h)} \times \left(z(h) - \min_h z(h)\right) + T_{low}(d).$$
Note that $z(h)$ has first been normalized by dividing each observed hourly temperature by the mean over the 30 days.
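The rescaling used to build the hourly temperature forecasts can be sketched in a few lines of Python. The function name and argument names are illustrative only; `z` stands for the 24 values of the mean hourly profile, and `t_high` and `t_low` for the predicted daily maximum and minimum.

```python
import numpy as np

def hourly_temperature_forecast(z, t_high, t_low):
    """Stretch the mean hourly profile z(h) so that it spans [t_low, t_high]."""
    z = np.asarray(z, dtype=float)
    scale = (t_high - t_low) / (z.max() - z.min())
    return scale * (z - z.min()) + t_low
```

The hour at which the profile is lowest is mapped to the predicted minimum and the hour at which it is highest to the predicted maximum; all other hours are interpolated linearly along the profile.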
Figure 5.1: Grid search for a training set (a) with a one day autoregressive part and (b) without autoregressive part. [Two surface plots of the cross-validation score over a grid with axes labelled sigma and lambda, each ranging from 0 to 1000.]
Lad-lasso vs Ls-svm In-sample Fits
We notice in Fig. 5.2 that the in-sample fit is excellent for both the
lad-lasso and ls-svm when an auto-regressive part is included in the model.
Visual inspection of the graphs also seems to indicate that increasing the
lag order does not markedly improve the fit.
Out-of-sample Predictive Performance
A typical out-of-sample prediction is shown in Fig. 5.3. The prediction
starts on 10/30 and ends on 11/3. November 1st and 2nd correspond to a
Saturday and a Sunday. We notice that the fit performs well in the short run
(days one and two), but the week-end peaks are not well captured. A more
accurate picture of the out-of-sample predictive performance of the spot
model is given in Tables 5.1 to 5.3 for the lad-lasso and ls-svm regressions
and for different lag orders over three forecast horizons. It is important
to note that the performance indicators are mean values. In order to
evaluate the predictive performance of the model more robustly, we performed
1, 4 and 7 days-ahead predictions every three days starting from Sept 1,
2009, until Dec 15, 2009. This gave us 33 measures for each performance
indicator, whose average2 is reported in Tables 5.1 to 5.3.
Figure 5.2: In-sample fit with parameters (15,900) for three window sizes (in days): (a) ∆ = 0, (b) ∆ = 1, (c) ∆ = 4.
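The text does not detail how the 33 forecast origins are laid out. One reading that reproduces this count, offered here only as an assumption, is to start a new forecast every three days and to keep only the origins whose seven-day horizon ends on or before Dec 15, 2009. The function name and its parameters are hypothetical.

```python
from datetime import date, timedelta

def forecast_origins(start, end, step_days=3, max_horizon_days=7):
    """List the forecast start dates of a rolling out-of-sample evaluation.

    An origin is kept only if origin + max_horizon_days <= end
    (assumed convention, not stated in the thesis).
    """
    origins = []
    day = start
    while day + timedelta(days=max_horizon_days) <= end:
        origins.append(day)
        day += timedelta(days=step_days)
    return origins

origins = forecast_origins(date(2009, 9, 1), date(2009, 12, 15))
print(len(origins), origins[0], origins[-1])
```

Each performance indicator would then be computed once per origin and horizon and averaged over the origins.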
We first notice in Tables 5.1 to 5.3 that ls-svm generally performs better
than lad-lasso whatever the performance indicator. Second, when the
forecast horizon is extended, the quality of the prediction increases if the
length of the auto-regressive part is also increased. Finally, the best
performance in terms of correlation and mean standard deviation is given by
the ls-svm 1-day-ahead/2-lags model, the ls-svm 4-days-ahead/2-lags model
has the lowest MAPE, while the lowest mean error is given by the ls-svm
7-days-ahead/4-lags model.
2We are aware that a confidence interval could have been provided.
Figure 5.3: A typical one week ahead forecast with parameters: (∆, γ, σ) = (1, 50, 500)
Figure 5.4: The weather forecast and the observed weather (Fall 2009). By
construction, the peak values of the forecast model are the one day ahead
forecast. [Time series plot from 28 Oct to 9 Nov; vertical axis: degrees Celsius; series: weather forecast, observed weather.]
Lags (∆)                         0 Day            2 Days            4 Days            6 Days
Performance          ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm
Correlation              0.82     0.83     0.83     0.85     0.83     0.84     0.83     0.84
Mean Error (%)          21.68    21.57    20.50    18.09    20.27    18.79    21.04    20.39
MAPE                     5.63     5.49     5.50     4.72     5.46     4.92     5.55     5.35
Mean Std Dev. (%)       19.87    19.75    19.62    18.17    19.54    18.58    19.75    19.46
Table 5.1: Forecast one day ahead
Lags (∆)                         0 Day            2 Days            4 Days            6 Days
Performance          ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm
Correlation              0.81     0.82     0.80     0.83     0.79     0.82     0.80     0.82
Mean Error (%)          20.73    21.05    19.31    17.88    19.43    18.73    20.05    19.35
MAPE                     5.28     5.30     4.91     4.58     4.95     4.79     5.04     4.97
Mean Std Dev. (%)       22.50    22.27    21.68    20.11    21.79    21.06    21.51    21.48
Table 5.2: Forecast four days ahead
Lags (∆)                         0 Day            2 Days            4 Days            6 Days
Performance          ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm ladlasso    lssvm
Correlation              0.78     0.77     0.79     0.79     0.78     0.80     0.76     0.80
Mean Error (%)          18.51    18.76    19.46    17.87    19.83    17.82    21.87    19.23
MAPE                     5.54     5.14     5.27     4.81     6.23     4.90     6.20     5.26
Mean Std Dev. (%)       21.41    20.75    21.25    20.50    22.77    20.17    23.88    21.26
Table 5.3: Forecast seven days ahead
Falsifiability test
As indicated in Section 4.1.1, we can apply various falsifiability tests to
check whether or not the model behaves as the true model should. The
most powerful way to verify this hypothesis is by testing:
℘(forecasted weather) ⪰ ℘(observed weather).
We also test the two less stringent conditions described in Section 4.1.1.
Two independent quality indicators are used for these falsifiability tests: the
MAPE and the correlation coefficient. We apply the following rule:
$$\begin{cases} \wp(A) \succ \wp(B) & \text{if the two indicators of } A \text{ perform better,} \\ \wp(A) \sim \wp(B) & \text{if only one indicator of } A \text{ performs better,} \\ \wp(A) \prec \wp(B) & \text{if none performs better.} \end{cases}$$
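The decision rule can be written as a small Python function. This is only an illustration: the function name, the dictionary keys and the returned labels are hypothetical, and the numbers in the usage line are made up for the example rather than taken from our results.

```python
def compare_scenarios(a, b):
    """Rank weather scenario A against B using MAPE and correlation.

    a, b: dicts with keys 'mape' (lower is better) and
    'correlation' (higher is better).
    Returns 'better' if both indicators of A win, 'equivalent' if
    exactly one wins, and 'worse' if none does.
    """
    wins = int(a["mape"] < b["mape"]) + int(a["correlation"] > b["correlation"])
    return {2: "better", 1: "equivalent", 0: "worse"}[wins]

# Made-up indicator values, for illustration only.
print(compare_scenarios({"mape": 4.7, "correlation": 0.85},
                        {"mape": 5.1, "correlation": 0.83}))
```

Note that ties on an indicator count as "not better" in this sketch, which is one possible reading of the rule.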
Fig. 5.4 compares the predicted hourly temperature series with the one
observed a posteriori. As expected, the observed temperature exhibits a
more wiggly pattern due to the variations induced by wind, precipitation
and snowfall. The results of the falsifiability tests are presented in
Table 5.4 for both the auto-regressive ls-svm and the lad-lasso. The ls-svm
model passes the test successfully, so the ls-svm estimator is likely to
capture the true data generation process. The lad-lasso model, by contrast,
performs better when observed temperatures are used as predictor in lieu of
temperature forecasts. This estimation technique therefore seems less
adequate in terms of the falsifiability criterion, which seems to indicate
that the true model is non-linear.
Performance under transitions/vacation periods
All the above results were obtained with a dataset that mixes regular
patterns with more noisy ones such as those encountered during transition
phases from fall to winter or official/local holiday periods. As an example,
the regression based on September values may include a no-heating period,
but the forecasts are done for the heating period. Another example is the
forecasts for early January which are based on a training set that includes
Christmas and New Year’s Eve and erratic patterns in between.
AR-lssvm:
• ℘(Forecast one day ahead) ≻ ℘(Observed (real) weather)
• ℘(Forecast one day ahead) ≻ ℘(Seasonal expectations)
• ℘(Seasonal expectations) ≻ ℘(Random walk weather)
Lad-lasso:
• ℘(Forecast one day ahead) ≺ ℘(Observed weather)
• ℘(Forecast one day ahead) ≻ ℘(Seasonal expectations)
• ℘(Seasonal expectations) ≻ ℘(Random walk weather)
Table 5.4: Falsifiability test: results
Empirical evidence shows that during these particular periods the ls-svm approach can perform very poorly, whereas the lad-lasso is more robust and displays lower volatility. In Fig. 5.5, the presence of the heating days at the beginning of the simulation set in Exhibit (a) biases the forecasts in Exhibit (b), though the lad-lasso fits are better than the ls-svm ones. Another example is given in Fig. 5.6, where the training set includes the Christmas holidays. The past spot price displays a rather irregular pattern (note the midnight peak of New Year's Eve) due to the drop in many economic activities. Again the ls-svm fits are misled, while the lad-lasso is closer to the true curve.
[Figure 5.5, panels (a) and (b). x-axis: date, 10/08 to 10/19; y-axis: Price [EUR/MWh], 0 to 120; series: Ls-svm, Lad-lasso, Observed.]
Figure 5.5: Training set under a seasonal transition: the first heating days.
5.2 Hourly Price Forward Curve Model
The data used to perform the regressions for the HPFC cover the period Jan 2007 to Dec 2009. More precisely, given that the estimates for the HPFC start on 1 January 2009, we use the two previous years of data as the training set. The 2-year window then shifts forward as the HPFC horizon is increased. The variables involved in our estimation are the following:
• The Swiss, German and French spot prices, hourly-based, in Euro/MWh
from EPEX.
• The German and French futures contracts, base, peak, monthly, quar-
terly and yearly-based, traded in Euro/MWh from EEX.
• The daily-based Swiss weather indicators which comprise eight compo-
nents: the mean, highest and lowest temperature; the relative humidity
level, precipitation in mm, the wind speed in m/h and the cooling and
[Figure 5.6. x-axis: date, 12/27 to 01/07; y-axis: Price [EUR/MWh], 0 to 80; series: Ls-svm, Lad-lasso, Observed.]
Figure 5.6: Training set under a vacation period: Christmas holidays
the heating days defined as follows:
Cooling Days = max(Mean Temp. − 18, 0),
Heating Days = max(18 − Mean Temp., 0).
The seasonal variations are calculated from the historical data of the past 40 years.
• The Swiss national average reservoirs level in meters.
• The Swiss, French and German national holidays. A holiday is handled
as a Sunday in the regression (in the day indicator dummy matrix).
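Two of the regressors listed above, the degree-day indicators and the day-of-week dummies with holidays coded as Sundays, can be sketched as follows. The function names, the assumption that temperatures are in degrees Celsius, the Monday-first ordering of the dummy row and the example holiday set are illustrative choices, not taken from the thesis.

```python
from datetime import date

BASE_TEMP_C = 18.0  # threshold from the definitions above (assumed to be in Celsius)


def cooling_days(mean_temp_c):
    """Cooling Days = max(Mean Temp. - 18, 0)."""
    return max(mean_temp_c - BASE_TEMP_C, 0.0)


def heating_days(mean_temp_c):
    """Heating Days = max(18 - Mean Temp., 0)."""
    return max(BASE_TEMP_C - mean_temp_c, 0.0)


def day_dummies(day, holidays):
    """Return a 7-element 0/1 row (Monday first); a holiday is coded as a Sunday."""
    index = 6 if day in holidays else day.weekday()
    return [1 if i == index else 0 for i in range(7)]


# Illustrative holiday set: the Swiss National Day, which fell on a Saturday in 2009.
holidays = {date(2009, 8, 1)}

print(heating_days(3.5), cooling_days(3.5))
print(day_dummies(date(2009, 8, 1), holidays))  # Saturday holiday, coded as Sunday
print(day_dummies(date(2009, 8, 3), holidays))  # ordinary Monday
```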
HPFC Estimation Steps for Switzerland
The HPFC model presented in section 4.1.2 can be applied in all markets which possess spot and futures data. Since there are no futures contracts for electricity in Switzerland, we need a plausible model to estimate their price. The simplest way to do so is to use information from the most influential neighboring electricity markets, i.e. the French and German ones (see Fig. 2.9 page 19 or Section 2.3.2). We use the relationship between the Swiss spot prices for electricity and the French and German spot prices to build futures contracts for Switzerland under an arbitrage-free condition. The underlying assumption of this approach is that the impact of the French and German spot markets on the Swiss spot market would translate symmetrically to the Swiss futures contracts if they were to exist. As official holidays in Germany and France also influence the spot price in Switzerland, the specific effect of official holidays is also taken into account.
The relationship between the Swiss electricity spot prices and the German and French electricity spot prices, all daily-based, is given by
\[ S_{CH} = \alpha_0 + \alpha_{DE} S_{DE} + \alpha_{FR} S_{FR} + \beta_{CH} H_{CH} + \beta_{DE} H_{DE} + \beta_{FR} H_{FR} \tag{5.1} \]
where $S_X$ and $H_X$ denote, for country X, the spot price and the dummy variable which captures holiday attributes. A lad-lasso regression is performed over the period 2007/12/11 to 2009/12/10 and we get the following results:
Variables      S_DE    S_FR     H_CH   H_DE   H_FR    const
Coefficients   0.22    0.78     1.90   0      -0.58   3.20
Student’s t    45.60   175.70   5.41   ∅      0.99    22.71
Table 5.5: Results of the regression (R2 = 0.77)
We notice in Table 5.5 that 77% of the total variance is explained by this regression. The Swiss spot price is much more sensitive to the French spot price than to the German spot price (coefficients of 0.78 and 0.22 respectively). This is also confirmed by the very high t-statistic associated with S_FR. Moreover, the coefficient for the German holiday variable happens to be shrunk to zero and the variable is therefore dropped from the regression. The French holidays have a decreasing impact on the Swiss spot price, but note that the coefficient is not significant at the 10% level. The negative impact can be explained by the fact that the French nuclear plants cannot be stopped for short periods of time, so that France experiences a temporary over-supply during the French holidays that can be sold to Switzerland. The positive sign of the Swiss holidays' coefficient (1.90 in Table 5.5) was not expected and is rather difficult to interpret given that holidays in Switzerland are set at the canton's level.
We further investigate the stability of the coefficients of the German and French spot prices (αDE, αFR) by re-estimating the regression over 2-year and 100-day windows ending on dates from 2009/01/01 to 2009/12/10.
[Figure 5.7. x-axis: Calculation date, Jan09 to Nov09; y-axis: Coefficients, 0 to 1; series: DE − 100 days regression, FR − 100 days regression, DE − 2 years regression, FR − 2 years regression.]
Figure 5.7: Evolution of the coefficients
Fig. 5.7 shows that the 2-year coefficients remain fairly stable and fluctuate close to the values given in Table 5.5. In contrast, the 100-day coefficients exhibit strong variations, although the changes remain fairly smooth. This indicates that the sensitivity of the Swiss spot price to the French and German spot markets is not constant over time. Interestingly enough, the steady drop of the 100-day coefficient of the French spot price from the end of 2008 until March 2009 coincides with the most productive period of the German wind turbines. Then, the increase in both the French and German 100-day coefficients after March 2009 corresponds to a period of rather empty dams in Switzerland (see Fig. 2.8). Finally, the Oct. 2009 change in both the German and French 100-day and 2-year spot coefficients occurs at a time when France exceptionally shut down several power plants for maintenance reasons.
The relationship (5.1) is then exploited to build Swiss futures by setting:
\[ F_{CH}(t) = \alpha_{DE} F_{DE}(t) + \alpha_{FR} F_{FR}(t) + \alpha_0. \tag{5.2} \]
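As a worked illustration of equation 5.2, the sketch below plugs the coefficients reported in Table 5.5 into the formula. The function name `swiss_future` and the German and French futures quotes are hypothetical; only the three coefficients come from the text.

```python
# Coefficients taken from Table 5.5 (holiday terms are not part of equation 5.2).
ALPHA_0 = 3.20
ALPHA_DE = 0.22
ALPHA_FR = 0.78


def swiss_future(f_de, f_fr):
    """Synthetic Swiss futures price in EUR/MWh following equation 5.2."""
    return ALPHA_DE * f_de + ALPHA_FR * f_fr + ALPHA_0


# Hypothetical quotes in EUR/MWh, for illustration only.
german_quote = 50.0
french_quote = 55.0
print(round(swiss_future(german_quote, french_quote), 2))
```

With these hypothetical quotes the computation is 0.22 × 50 + 0.78 × 55 + 3.20 = 57.10 EUR/MWh.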
In the following, we apply the 4-stage procedure in conjunction with equation 5.2 to build the Swiss HPFC. Recall that this procedure requires estimating the bottom-up determinants of the spot prices with a regression technique (see stage 1 of the procedure presented in section 4.1.2). We use the lad-lasso and the ls-svm regressions and compare their results, so that two HPFCs are estimated, one with each regression technique. Below, we comment on some interesting results of the spot regressions; stages 2-4 are of little interest on their own and are not discussed in detail.
Bottom-up Model for the Spot Prices: the Lad-lasso Variable Selection
Since the lad-lasso estimator of the bottom-up spot model performs a variable selection procedure, it is interesting to list the spot predictors retained with non-zero coefficients for Switzerland, Germany and France. Note that the estimates of the German and French spot models are of no use for building the Swiss HPFC. Table 5.6 presents the results. Overall, it may be surprising to see that the mean temperature is selected in none of the spot models. However, when the weather regressors are strongly correlated, we should expect some of the highly correlated determinants to be dropped. We also notice that some country specificities are well captured by the lad-lasso estimator. Germany, with roughly 24 GW of installed capacity in 2008, is one of the major producers of wind energy in the world (behind the United States). The lad-lasso has selected wind speed as a key predictor of the German spot prices. For Switzerland, the average level of the national dams has been selected among many other determinants. Since France relies heavily on nuclear power, it is harder to interpret the selection of only one weather variable, heating. In the three cases, the week-end days Saturday and Sunday are selected, which means that the lad-lasso correctly singles these days out.
Bottom-up Model for the Spot Prices: Lad-lasso vs Ls-svm In-sample Fits
Before turning to the in-sample performance of the lad-lasso and ls-svm estimators, note that the hyperparameters (γ,σ) of the ls-svm regression are determined with a grid search. The cross-validation function is shown in Fig. 5.8. We notice a wide flat zone where the CV score is almost constant.
Switzerland:
• Saturday, Sunday
• September, December
• Heating Days, Cooling Days, Reservoirs level
France:
• Saturday, Sunday
• June, September
• Heating Days
Germany:
• Saturday, Sunday
• September
• Heating Days, Wind Speed
Table 5.6: Variable selection: results
[Figure 5.8: surface plot titled "Grid search results". Horizontal axes: sigma and lambda; vertical axis: Leave-one-out CV score, approximately 0.041 to 0.047.]
Figure 5.8: Determination of (γ,σ).
The in-sample fits are shown in Fig. 5.9. The regression being daily-based, we can identify the inverted-U-shaped weekly pattern, with the highest values on week days and the lowest values on week-ends. These patterns are later converted to hourly profiles for the HPFC estimates.
Figure 5.9: In-sample fit
We also notice in Fig. 5.9 that the fit is better with the ls-svm regression than with the lad-lasso one, particularly for the cooling days and the winter peaks. The lad-lasso produces smoother estimates. We can also distinguish the seasonal and weekly variations for both curves, as well as the impact of the weather predictors. Regarding the latter effect, we clearly see that the spot prices are lower in winter 2008 (Q1-08) than in winter 2009 (Q1-09).³ The only difference in the training set between these two seasons lies in the weather indicators. This underlines the ability of both regression methods to capture the impact of the weather well. A forecast based only on past trends (monthly clustering) would result in similar levels for the Q1-08 and Q1-09 fits.
As already mentioned in the methodological section, the choice of the parameters (γ,σ) is key to optimally balancing the bias and variance of the fits. This trade-off is illustrated in Fig. 5.10. A very small σ would give a perfect in-sample fit, at the cost of bad out-of-sample performance. Conversely, a high σ and a low γ would lead to a curve shaped like the lad-lasso one or, for extreme values, to a totally flat curve.
³ Winter 2009 was colder than winter 2008.
Figure 5.10: Extreme behaviour of ls-svm: (a) perfect fit achieved with a σ of 1 (γ = 1000) and (b) horizontal curve with a σ of 50,000 (γ = 10).
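The two extreme regimes can be reproduced on a toy series with a minimal LS-SVM regression written in plain Python. This is a sketch under stated assumptions, not the implementation used for the figures: the Gaussian kernel scaling exp(−d²/(2σ²)), the toy data, the helper names and the particular (γ,σ) values for the tight fit are all illustrative, and σ must be read relative to the spacing of the toy inputs.

```python
import math


def rbf(x, z, sigma):
    """Gaussian kernel; the 2*sigma**2 scaling is one common convention (assumption)."""
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))


def solve(a, rhs):
    """Solve the linear system a * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lssvm_fit(xs, ys, gamma, sigma):
    """Solve the LS-SVM dual system [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the diagonal of K
        a.append(row)
    sol = solve(a, [0.0] + list(ys))
    return sol[0], sol[1:]  # bias b, dual weights alpha


def lssvm_predict(x, xs, alpha, b, sigma):
    return b + sum(al * rbf(x, xi, sigma) for al, xi in zip(alpha, xs))


# Toy series: a slow sine wave plus an alternating perturbation.
xs = [float(i) for i in range(20)]
ys = [math.sin(x / 3.0) + 0.1 * ((-1) ** int(x)) for x in xs]


def fitted(gamma, sigma):
    b, alpha = lssvm_fit(xs, ys, gamma, sigma)
    return [lssvm_predict(x, xs, alpha, b, sigma) for x in xs]


tight = fitted(gamma=1000.0, sigma=0.1)    # small sigma: near-perfect in-sample fit
flat = fitted(gamma=10.0, sigma=50000.0)   # large sigma, low gamma: almost horizontal
print(max(abs(f - y) for f, y in zip(tight, ys)))
print(max(flat) - min(flat))
```

The first printed number is the largest in-sample residual of the tight fit, and the second is the spread of the flat fit; both should be very small, for opposite reasons.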
[Figure 5.11: plot titled "Output of the regression on the simulation test". x-axis: Time, Mar 2010 to Jun 2013; y-axis: Weighting coefficient, 0.7 to 1.3; series: Linear, Non-linear.]
Figure 5.11: Simulation Output from the HPFC Estimation Procedure.
In the second and third stages of the HPFC estimation procedure, simulations are performed with the bottom-up model for Swiss spot prices and a normalization is applied. Fig. 5.11 shows the resulting patterns for the lad-lasso and ls-svm regressions once the data have been normalized. The impact of each variable selected by the lad-lasso appears very clearly in the related curve, while the pattern for the ls-svm estimator is much more wiggly. The jumps are linked to the seasonal indicators and the sinusoidal shape is due to the weather indicators. The application of the arbitrage-free conditions smooths the curve.
Figure 5.12: Hourly Price Forward Curve obtained from (a) the lad-lasso and (b) the ls-svm. The second curve simply looks more realistic. On average, both curves are equal to the monthly, quarterly and yearly futures.
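The property stated in the caption, that the curve averages to the futures price over a delivery period, can be illustrated with a deliberately simplified sketch. It assumes a multiplicative rescaling of normalized weights over a single delivery period; the function name, the weights and the futures quote are hypothetical, overlapping monthly, quarterly and yearly contracts are not handled, and stages 2-4 of the procedure of section 4.1.2 are not reproduced.

```python
def scale_to_future(weights, future_price):
    """Rescale a profile so that its average equals the futures price.

    Assumption: a multiplicative normalisation over a single delivery period.
    """
    mean_weight = sum(weights) / len(weights)
    return [future_price * w / mean_weight for w in weights]


# Hypothetical normalised weights for one delivery period and a hypothetical
# futures quote in EUR/MWh.
weights = [0.9, 1.1, 1.2, 1.0, 0.8]
curve = scale_to_future(weights, 60.0)
print([round(p, 2) for p in curve])
print(round(sum(curve) / len(curve), 6))
```

The shape of the profile is preserved while its level is pinned to the futures quote, which is the sense in which the curve is consistent with the traded contract.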
HPFC curves
Fig. 5.12 shows the HPFC obtained when the lad-lasso (a) and ls-svm (b) estimators are applied at the first stage of the estimation procedure. Let us first focus on the lad-lasso HPFC in (a). The first thing to notice is the seasonal pattern: the prices of the long term contracts are high in winter and low in summer. Second, we clearly see a September peak over the whole period estimated (cluster of high peaks before the end of the years 2011, 2012, 2013 and 2014). These peaks are expected given that September corresponds to the return from the summer vacation, when all sectors of the economy are reactivated. Note however that this peak may be over-amplified due to the exceptionally high prices reached in September 2009.
Regarding the ls-svm HPFC in Fig. 5.12(b), we notice first that the global shape is similar to the one derived with the lad-lasso. The main difference comes from the fact that the ls-svm HPFC possesses more pronounced monthly variations. The Christmas drop and the cooling days appear very clearly. May and August prices, when neither cooling nor heating is usually required, are both very low. The September peak is also present but is less pronounced than in the lad-lasso results.
Based on these considerations, our impression is that the methodology proposed to estimate the Hourly Price Forward Curve for Switzerland provides a meaningful shape which captures the most prominent stylized facts that characterize the Swiss electricity prices.
Finally, according to the backtest results reported in Table 5.7, the ls-svm performs better than the lad-lasso. The vast majority of the quality indicators for the ls-svm fits outperform the lad-lasso ones whatever the forecasting horizon. Note also that the departures of the forward prices from the spot prices increase with the forecasting horizon. This is expected, as the convergence of the forward value toward the spot price is only expected to hold in the short run.
Forecasting range    1D ahead           5D ahead           30D ahead
                     ladlasso  lssvm    ladlasso  lssvm    ladlasso  lssvm
Correlation          0.82      0.82     0.76      0.78     0.75      0.76
Mean Error (%)       24.19     23.35    25.90     25.02    25.73     25.10
MAPE                 6.12      5.9      6.52      6.33     6.66      6.43
Mean Std Dev. (%)    17.78     17.70    22.26     21.50    25.90     25.40
Table 5.7: Backtesting results
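For readers who want to recompute indicators of this kind, the sketch below implements two of them with their textbook definitions: the Pearson correlation and the mean absolute percentage error. The exact definitions behind Table 5.7 are given elsewhere in the thesis and may differ; the function names and the price series are hypothetical.

```python
import math


def correlation(forecast, observed):
    """Pearson correlation between forecast and observed prices."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)


def mape(forecast, observed):
    """Mean absolute percentage error, in percent.

    Not defined when an observed price is zero, which can happen on power markets.
    """
    n = len(observed)
    return 100.0 * sum(abs((o - f) / o) for f, o in zip(forecast, observed)) / n


# Hypothetical prices in EUR/MWh; every forecast is off by exactly 10 percent.
observed = [50.0, 60.0, 40.0, 80.0]
forecast = [55.0, 54.0, 44.0, 72.0]
print(round(mape(forecast, observed), 6))
print(round(correlation(forecast, observed), 4))
```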
Chapter 6
Conclusion
In this master thesis, two different models have been employed to predict electricity prices in Switzerland: a spot price model and a model for estimating hourly prices for long term contracts (the so-called Hourly Price Forward Curve or HPFC). The spot price model consists of a standard autoregressive equation with bottom-up variables (weather indicators and seasonal variables). Two recent regression techniques, the least absolute deviation lasso (lad-lasso) and the least squares support vector machine estimator (ls-svm), have been used for the estimation. The former method is more robust than the standard OLS estimator and includes a variable selection procedure, while the latter is more appropriate in the presence of nonlinearities but excludes structural interpretations.
Our results for the spot price model indicate that ls-svm regression out-
performs the lad-lasso estimator in terms of out-of-sample performance. This
holds over the three forecasting horizons investigated, i.e. 1, 3 and 7 days
ahead. However, the lad-lasso is more reliable when the predictions are done
over holiday periods or during transitions between seasons. Therefore, these
two estimators are complementary.
Regarding the HPFC, we overcome the fundamental problem of the absence of prices for standardized long term contracts (futures) in Switzerland by using German and French futures prices and building Swiss prices through a statistical relationship based on spot prices. We show that the HPFC obtained exhibits a meaningful shape. The multi-step procedure used to build the HPFC for Switzerland also indicates that the underlying relationships clearly capture important stylized facts linked to the Swiss, German or French electricity markets. For example, the lad-lasso regression highlights the strong impact of the reservoir levels and of the wind speed on the electricity spot prices in Switzerland and Germany respectively. However, the ls-svm regression technique again appears superior to the lad-lasso for building the most reliable HPFC.
Clearly, further work remains to be done before using our HPFC for pric-
ing long term contracts or hedging in the Swiss electricity market. Here,
we show that in the absence of some fundamental financial instruments in a
recently liberalized electricity market, we can combine arbitrage-free condi-
tions and simple statistical relationships with other interconnected electricity
markets to build fundamental missing financial instruments.
Appendix A
Two Key Ideas of Least-Squares Support Vector Machine
A.1 The Maximum Margin
The support vector machine theory is historically a classification theory. It suggests separating two classes with the hyperplane that has the maximum margin. The margin is the distance between the separating hyperplane and the closest sample points. This definition implies that some points are not useful: only the points closest to the classifier hyperplane, called support vectors, matter. In Fig. A.1, H1 and H2 are both linear classifiers, but H1 has the largest margin to the closest sample points. According to the theory, H1 is therefore the more reliable classifier. The circled points become the support vectors. This classifier is easy to find in a 2-dimensional plane, but its usefulness is limited if it applies only to linearly separable data. When the sample size and the dimensions increase, the classifier is less obvious. Here comes the second main idea of the support vector machine theory: the Kernel Trick.
Figure A.1: The determination process of the classifier hyperplane.
A.2 The Kernel Trick
This trick was first introduced by Aizerman in 1964 and consists in mapping the original observations, which are not linearly separable, into a higher-dimensional space where a linear classifier is available. Finding a linear classifier in this dual space is equivalent to finding a non-linear classifier in the primal space. To illustrate the procedure, one can consider a power function plotted on a regular scale and on a logarithmic scale: the curve becomes a straight line in the latter. In Fig. A.2 a group of data is surrounded by a different group. Since a linear classifier would not be able to separate these two distinct groups, the problem is projected into a higher-dimensional space. We go from a two-dimensional problem to a three-dimensional problem with the non-linear mapping function
\[ \Phi : \mathbb{R}^2 \to \mathbb{R}^3, \qquad (x_1, x_2) \mapsto \left( x_1^2,\ \sqrt{2}\, x_1 x_2,\ x_2^2 \right) \]
Figure A.2: The Kernel Trick.
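A short numerical check may help to see why this mapping is a "trick". The scalar product of two mapped points equals the squared scalar product of the original points, so the three-dimensional space never has to be built explicitly; in addition, a circular boundary in the plane becomes a linear one in the mapped coordinates. The helper names and the sample points below are illustrative only.

```python
import math


def phi(x1, x2):
    """Feature map (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return (x1 ** 2, math.sqrt(2.0) * x1 * x2, x2 ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, evaluated in the original space."""
    return dot(x, z) ** 2


# The two quantities agree up to floating-point rounding.
x, z = (1.0, 2.0), (3.0, -1.0)
print(dot(phi(*x), phi(*z)), poly_kernel(x, z))


def squared_radius(p):
    """x1^2 + x2^2 written as a linear function of the mapped coordinates."""
    z1, _, z3 = phi(*p)
    return z1 + z3


# Hypothetical points: one inside and one outside the unit circle.
inner, outer = (0.5, 0.5), (2.0, 0.0)
print(squared_radius(inner) < 1.0 < squared_radius(outer))
```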
A.3 General Framework
The mapping of the sample points into the higher-dimensional space is done with symmetric, positive semi-definite kernel functions with bandwidth σ. The sample points are projected on a kernel-based algebra with the help of Mercer's theorem. The optimization problem is then solved in this dual space with a Tikhonov regularization (with parameter γ). This avoids having to derive the initial non-linear function ϕ explicitly. The return to the primal space is done with the Nyström approximation, which is a scalar product with the kernel base vectors. The framework is illustrated in Fig. A.3.
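For reference, the regression version of this primal-dual scheme can be written compactly in the standard LS-SVM formulation of Suykens et al. (2002). The symbols $e_i$, $\alpha$, $\Omega$ and $1_N$ are introduced here for the illustration, and the notation of the methodological chapter may differ.

```latex
% Primal problem: ridge-type (Tikhonov) penalty on w, squared errors weighted by gamma
\min_{w,\,b,\,e} \ \frac{1}{2}\, w^{\top} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2}
\quad \text{s.t.} \quad y_i = w^{\top} \varphi(x_i) + b + e_i, \qquad i = 1, \dots, N

% Dual problem: a linear system in (b, alpha), with \Omega_{ij} = K(x_i, x_j) = \varphi(x_i)^{\top} \varphi(x_j)
\begin{bmatrix} 0 & 1_N^{\top} \\ 1_N & \Omega + \gamma^{-1} I_N \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix}

% Resulting predictor: only kernel evaluations are needed, never \varphi itself
\hat{y}(x) = \sum_{i=1}^{N} \alpha_i \, K(x, x_i) + b
```

The dual system only involves the kernel matrix, which is why the mapping ϕ never has to be written down.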
Figure A.3: The primal-dual approach of the LS-SVM. Source: Suykens and Vandewalle [1999].
Bibliography
Amaral, L., R. Souza, and M. Stevenson (2008). A smooth transition
periodic autoregressive (STPAR) model for short-term load forecasting.
International Journal of Forecasting 24(4), 603–615.
Carmona, R. and M. Ludkovski (2004). Spot convenience yield models
for the energy markets. In Mathematics of finance: Proceedings of an
AMS-IMS-SIAM Joint Summer Research Conference on Mathematics of
Finance, June 22-26, 2003, Snowbird, Utah, Volume 351, pp. 65. AMS
Bookstore.
Cawley, G. and N. Talbot (2007). Preventing over-fitting during model se-
lection via Bayesian regularisation of the hyper-parameters. The Journal
of Machine Learning Research 8, 861.
Chan, K., P. Gray, and B. Van Campen (2008). A new approach to char-
acterizing and forecasting electricity price volatility. International Journal
of Forecasting 24(4), 728–743.
Espinoza, M., J. Suykens, R. Belmans, and B. De Moor (2007). Electric Load
Forecasting. IEEE Control Systems Magazine 27(5), 43–57.
Espinoza, M., J. Suykens, and B. De Moor (2005). Imposing symmetry in
least squares support vector machines regression. In IEEE Conference
on Decision and Control, Volume 44, pp. 5716. IEEE; 1998.
Espinoza, M., J. Suykens, and B. De Moor (2006). Fixed-size least squares
support vector machines: A large scale application in electrical load
forecasting. Computational Management Science 3(2), 113–129.
Fleten, S. and J. Lemming (2003). Constructing forward price curves in
electricity markets. Energy Economics 25(5), 409–424.
Harris, C. (2006). Electricity Markets: Pricing, Structures and Economics.
Wiley Finance.
IEA (2007). Energy Policies of IEA Countries: Switzerland 2007 Review.
Technical report, International Energy Agency.
Suykens, J. and J. Vandewalle (2000). Recurrent least squares support vector
machines. IEEE Transactions on Circuits and Systems 47(7).
Keerthi, S., V. Sindhwani, and O. Chapelle (2007). An efficient method for
gradient-based adaptation of hyperparameters in SVM models. Advances
in Neural Information Processing Systems 19, 673.
Soares, L. and M. Medeiros (2008). Modeling and forecasting short-term
electricity load: A comparison of methods with an application to Brazilian
data. International Journal of Forecasting 24(4), 630–644.
Suykens, J., T. Van Gestel, J. De Brabanter, B. De Moor, and J. Vandewalle
(2002). Least squares support vector machines. World Scientific Pub Co
Inc.
Suykens, J. and J. Vandewalle (1999). Least squares support vector machine
classifiers. Neural processing letters 9(3), 293–300.
Taylor, J. (2008). An evaluation of methods for very short-term load fore-
casting using minute-by-minute British data. International Journal of
Forecasting 24(4), 645–658.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso.
Journal of the Royal Statistical Society. Series B (Methodological) 58(1),
267–288.
Tikhonov, A. (1963). Solution of incorrectly formulated problems and the
regularization method. In Soviet Math. Dokl, Volume 4, pp. 1035–1038.
Vapnik, V. (1998). Statistical learning theory.
Wang, H., G. Li, and G. Jiang (2007). Robust regression shrinkage and
consistent variable selection through the lad-lasso. Journal of Business
and Economic Statistics 25(3), 347–355.
Weron, R. and A. Misiorek (2008). Forecasting spot electricity prices: A com-
parison of parametric and semiparametric time series models. International
Journal of Forecasting 24(4), 744–763.