Johan Gustafsson - DiVA portal

UPTEC F15 025 Examensarbete 30 hp Juni 2015 Finding potential electroencephalography parameters for identifying clinical depression Johan Gustafsson


UPTEC F15 025

Degree project, 30 credits, June 2015

Finding potential electroencephalography parameters for identifying clinical depression

Johan Gustafsson


Faculty of Science and Technology (Teknisk- naturvetenskaplig fakultet), UTH-enheten. Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0. Postal address: Box 536, 751 21 Uppsala. Telephone: 018 – 471 30 03. Fax: 018 – 471 30 00. Website: http://www.teknat.uu.se/student

Abstract

Finding potential electroencephalography parameters for identifying clinical depression

Johan Gustafsson

This master thesis report describes signal processing parameters of electroencephalography (EEG) signals with a significant difference between the signals from the animal model of clinical depression and the non-depressed animal model. The signal from the depressed model had a weaker power in gamma (30 - 80 Hz) than the non-depressed model while awake and a stronger power in delta (1.5 - 4 Hz) during sleep.

The report describes the process of using visualisation to understand the shape of the signal, which helps with interpreting results and with developing parameters. A generic tool for time-frequency analysis was improved to cope with the size of the weeklong EEG dataset.

A method was developed for evaluating how well the EEG parameters are able to separate the strains with as short recordings as possible. This project shows that it is possible to separate an animal model of depression from an animal model of non-depression based on its EEG and that EEG-classifiers may work as indicative classifiers for depression. Not a lot of data is needed. Further studies are needed to verify that the results are not overly sensitive to recording setup and to study to what extent the results are translational. Some of the EEG parameters with significant differences described here may be limited to describing the difference between the two strains FSL and SD. However, the classifiers have reasonable biological explanations that make them good candidates for being translational EEG-based classifiers for clinical depression.

ISSN: 1401-5757, UPTEC F15 025
Examiner: Tomas Nyberg
Subject reader: Alexander Medvedev
Supervisor: Mia Lindskog


Popular science summary

Depression is a serious and common illness. According to the World Health Organization (WHO) it is the leading cause of disability. One complication in treating depression is the lack of a clear definition, and the symptoms of several different psychiatric conditions overlap with the symptoms of depression, so it can be hard to find the right treatment based on symptoms. Existing antidepressant drugs work for about 50% of patients but have little or no effect for others. The evaluation of symptoms in humans is also inherently subjective, which makes it harder to repeat clinical diagnoses reliably. Accurate means of diagnosis based on recordings of patterns in brain activity could help provide more accurate means of treatment. This study aims to describe potential biomarkers for clinical depression based on EEG recordings from rats.

The project describes signal processing parameters for electroencephalography (EEG) signals with a significant difference between the signals from an animal model of clinical depression (FSL) and the signals from a non-depressed animal model (SD). The signal from the depressed model was stronger in the gamma band (30-80 Hz) than that of the non-depressed model during wakefulness and was stronger in the delta band (1.5-4 Hz) during sleep.

The data used in this project is limited to recordings from only two strains, a non-depressed model and a depressed model. At best, this study can therefore only suggest potential biomarkers, since there is no data to account for other properties that are common to other conditions that could also affect brain activity in a similar way. Nor can the study distinguish strain differences that are unrelated to depression but still separate the strains.

This study used previously recorded data from another project, for which animal care and experiments were carried out in accordance with protocols submitted to and approved by the local ethics committee of Stockholm North, Sweden.

To distinguish signal parameters (such as signal strength) during different sleep states, a method for automated classification of sleep states was developed. Automated classification of sleep states is not new, but existing methods usually use signals from both brain activity and muscle activity to distinguish wakefulness from deep REM sleep, whose brain activity resembles that of wakefulness.

Another parameter that separated the two animal models was their chronotype. Chronotype indicates the propensity to sleep at a particular time of day. The depressed model had a circadian rhythm that was shifted one hour forward compared to the non-depressed model.

The study was based on 8 rats in each group and examined their EEG mainly during one day (24 hours) from each rat. Both groups slept for a total of about 50% of the 24 hours, but the depressed model slept slightly more in total than the non-depressed model. During the night they slept about 70% of the time, and during the day about 30% of the time. However, the depressed model slept somewhat less during the night and somewhat more during the day. There was also a marked difference in the stability of the sleep pattern during the day. During the day the non-depressed model had a stable pattern of mostly wakefulness interrupted by several short but regular periods of sleep. The depressed model, in contrast, not only slept more, but its sleep periods were irregular and considerably longer.

Signal strength can be measured in several ways. Two common measures are MAD and RMS. MAD is notable for not being sensitive to outliers. The ratio RMS / MAD is a stable measure that can easily be tested on another signal. The ratio was 1.36 for the depressed model and 1.41 for FSL. This is a small but significant difference that was enough to separate the groups.

An important part of the work in this project was the use of interactive visualisation of time-frequency representations to navigate the data and to be able to interpret the results.

An attempt was also made to study the effects of a ketamine treatment on the depression markers, but it yielded nothing because there were too few animals and probably too high doses.

The results imply that EEG signals from the two animal models differ in several respects that can be extracted through signal processing to obtain parameter estimates. It is possible that some of these parameters can be translated into biomarkers for clinical depression in humans. Since EEG is non-invasive and EEG equipment is readily available in clinical practice, it is possible to search for such biomarkers in humans as well in a future project.


Master thesis
Engineering Physics at Uppsala University

Thursday 11th June, 2015 12:32

Finding potential electroencephalography parameters for identifying clinical depression

Johan Gustafsson

Contents

Glossary

1 Introduction
1.1 The EEG signal
1.2 Animal models
1.3 Interactive time-frequency analysis - Freq
1.4 Purpose

2 Method
2.1 Data material
2.2 Notations and common equations
2.3 Workflow
2.3.1 Data management
2.4 Preprocessing
2.4.1 Artefact removal
2.4.2 Power normalization
2.4.3 Loglognormal spectrogram
2.4.4 Spectrogram decomposition
2.4.5 Sleep scoring
2.4.6 Subset segmentation
2.5 EEG parameters
2.6 Strain classification
2.6.1 Parameter decomposition
2.6.2 Strain classifier
2.6.3 Classifier quality
2.7 Ketamine effects

3 Results
3.1 Notation of results
3.2 Decomposition
3.3 Sleep scoring
3.4 EEG parameters
3.5 Strain classifiers
3.6 Ketamine effects

4 Discussion
4.1 Freq
4.2 Preprocessing
4.3 Decomposition
4.4 Sleep
4.5 Strain separation
4.6 Conclusion
4.7 Future work

5 Acknowledgments

6 References

Appendix
A Time-frequency plots of recorded data
A.1 Sprague Dawley
A.2 Flinders sensitive line

Glossary

API

Application programming interface. An API defines a set of routines for building software applications.

biomarker

A trait used for identification and/or diagnosis.

clinical depression

Clinical depression is a severe form of depression. In this text the term depression refers to clinical depression.

delta

EEG activity in the frequency range 1.5-4 Hz. See table 3.

EEG

Electroencephalography, measures brain activity through electrodes.

EMG

Electromyography, measures muscle activity through electrodes.

FSL

Flinders sensitive line, a strain of rats used as a model for depressed behaviour.

gamma

EEG activity in the frequency range 30-80 Hz. See table 3.

iEEG

Intracranial electroencephalography, measuring brain activity through electrodes placed inside the brain.

OpenGL

The open graphics library is an industry standard API for hardware accelerated rendering of 2D and 3D graphics.

qEEG

Quantitative electroencephalography, the study of brain activity through processing information from many electrodes simultaneously.

RAM

Random access memory. The RAM is the computer memory used to run active programs and keep their data. The RAM puts a limit on how much data the computer can have readily accessible within a few nanoseconds. The size is typically around a couple of GB.

SD

Sprague Dawley, a strain of normal lab rats used as a model for non-depressed behaviour.

translational

Applicable across different species, notably both humans and rats.

1 Introduction

This project describes signal processing parameters of electroencephalography (EEG) signals with a significant difference between the signals from the animal model of clinical depression and the non-depressed animal model. The signal from the depressed model had stronger power in gamma (30-80 Hz) than the non-depressed model during wakefulness, and stronger power in delta (1.5-4 Hz) during sleep.

The dataset used in this project is limited to recordings from only two strains, a non-depressed model and a depressed model. At best this study can therefore only suggest potential biomarkers, as there is no data to account for other properties that are common to other conditions that could also affect the brain activity in a similar way, nor to account for strain differences that are unrelated to depression but still separate the strains.

This study used previously recorded data from another project for which animal care and experimentation were performed in compliance with protocols submitted to, and approved by, the local ethics committee of Stockholm North, Sweden.

Depression is a severe and common disease that represents the leading cause of disability worldwide [1]. A complication in treating depression is the lack of a clear definition. Symptoms of several different psychiatric conditions overlap with the symptoms of depression, and it can therefore be hard to find the correct treatment based on symptoms. Existing antidepressant drugs work for about 50% of patients but have little to no effect for others [2, 3]. Evaluation of symptoms in humans is also inherently subjective, which makes it harder to reliably repeat clinical assessments. Accurate means of diagnosis based on recordings of patterns in brain activity could aid in providing more accurate means of treatment. This study aims to describe potential biomarkers of clinical depression in EEG recordings from rats.

This project was carried out in the Lindskog Laboratory [4] at the Department of Neuroscience at Karolinska Institutet.

1.1 The EEG signal

An EEG signal is created by measuring the electric potential between two electrodes placed either outside the skull (usually what is referred to as EEG) or inside the skull or deep within the brain (usually referred to as intracranial electroencephalography, iEEG). The signal recorded is the sum of the electrical activity of the individual neurons of the brain. The activation of a neuron causes ion currents across the neuron cell membrane, which create a small bipolar electric field. Figure 1 shows this field next to a firing neuron. The signal measured outside the skull is the accumulated effect of many neurons firing synchronously.

Figure 1: The effect of a firing cell on the electric potential in the vicinity of a cell. A reference electrode is placed far away from the cell. Each spike graph shows the normalized electric potential measured at that location, where the deepest red color is 1000 times stronger than the deepest blue color. From a distance a firing cell can be modelled by a dipole. The EEG captures the accumulated effect of many cells firing synchronously. The overlaid spike graphs span 5 ms each. The underlying picture of the cell spans 650 × 650 µm. Image from E. W. Schomburg, California Institute of Technology, USA.

EEG rhythms are typically studied within a set of frequency bands named by letters from the Greek alphabet, see table 3. Each band is associated with a specific type of brain activity, and the bands are translational, meaning that the same bands occur for the same type of activity in different mammals regardless of brain size. The first EEG was recorded in 1875 by Richard Caton in England [5] and Lemere published the first EEG findings related to depression in 1936 [6]. Lemere found correlations with increased alpha power, which to date is still considered a hallmark of depression. In 1973 d'Elia narrowed this further to show increased alpha activity near the forehead (prefrontal cortex) in patients diagnosed with clinical depression when the patient is awake with their eyes closed [7]. This has however not been clear enough to be used as a classifier for individuals while separating the signal from other diseases. This also means that it cannot alone account for comorbidity [8].

EEG patterns are largely conserved across mice, rats, dogs, non-human primates and humans [9], with the exception of the alpha band, which is represented by different frequencies in different species [9, 10]. The conservation of frequency bands suggests that biomarkers developed for rats will be translational, i.e. that the results from this study will also be applicable to human trials in subsequent studies.

An EEG is often recorded with multiple electrodes, giving different signals between different pairs of electrodes on the head. The varying intensities between the signals can be overlaid on an image of a head to create a brain map. The use of computationally intensive algorithms for studying the vast amount of data generated by a collection of such signals is referred to as quantitative electroencephalography (qEEG). This study has also employed computationally intensive algorithms, but to study an EEG from just two electrodes. This report does not use the term qEEG as there has been some dispute as to whether an analysis of a 2-electrode EEG should be called qEEG [11].

Some established biomarkers in EEG for clinical depression require long recordings and compare the average power in the main frequency bands. For example, sleep patterns are disturbed during depression, which shows up clearly in a long recording of an EEG. In addition, qEEG studies in patients with old-age depression found increased slow wave activity [12]. Differences in qEEG indicators were found even between unipolar and bipolar depressive disorders [13]. Abnormal qEEG indicators have been found to consistently predict therapeutic response [14, 15]. Prefrontal theta cordance could become an objective marker of change of depressive symptoms, independent of patients' compliance and symptom dissimulation, more precise than objective and self-rated depression rating scales [16].

However, if there are distinct features that can be detected within a few minutes of wakefulness, they could be implemented in primary care, where depression is most often diagnosed.

The focus in this study is to develop and study the quality of biomarkers that can be used on shorter recordings during non-sleep. A biomarker that requires a shorter and less involved recording session is better.

Recent development has indicated that an increase in glutamate causes symptoms similar to depression. Glutamate is the main neurotransmitter in the brain, making up about 80% of the brain's electrical activity. This theory is partly founded on the fact that ketamine, a glutamate receptor antagonist, has a fast-acting antidepressant effect. Thus, it should be possible to detect variations in glutamate levels through an EEG [17]. This report could be used as a prestudy for a forward translational project to segment depression based on EEG [18].

Figure 2: Time-frequency plot (CWT) of EEG data, with time in days (1 to 6) on the horizontal axis and frequency in Hz (2 to 100) on the vertical axis. Darker shades represent more activity. The non-depressed model (top) clearly has a pattern of EEG activity that matches the daily rhythm, in contrast to the depressed model (bottom) which has a less clear pattern.

A related application of EEG for diagnosis comes from the commercial company Mentis Cura, which claims to reliably detect Alzheimer's disease at an early stage from a 5 minute EEG recording [19]. Their method is based on statistical pattern recognition (SPR) on the EEG signals but it cannot alone account for comorbidity [20].

1.2 Animal models

The Flinders sensitive line (FSL) is a rat strain that we will use as a model for depression. The FSL strain was not intentionally created as a model of depression. The original intent was to produce a strain that was resistant to a drug that blocks the effects of a specific enzyme (an anticholinesterase). The selective breeding program instead created progressively increased sensitivity to another toxin (DFP) [21]. The FSL rat displays several traits that resemble depression in humans: they do not sleep well, they explore less, they show increased despair, they are bad at learning and remembering, they are restless, and they are bad at controlling stress compared to standard laboratory rats. In addition, they react positively to antidepressant medication [22].

Sprague-Dawley (SD) is a rat strain commonly used as a multipurpose standard laboratory rat in medical research. It is an albino (white fur) rat characterized by being calm and easy to handle. The same strain has been used since 1925.

1.3 Interactive time-frequency analysis - Freq

The approach taken in this study builds on studying the large datasets with software for visually assessing features of interest in the signal, and then developing algorithms that capture those features and describe them statistically. The human eye is good at assessing patterns in images. The software Freq leverages this ability by making it easy to switch between different representations of the data, allowing the detection of different features in different representations and from different perspectives. Being able to compute the spectrogram of a signal with 4·10^8 samples, zoom into details, change the perspective to look at variations over time in specific bands or at which frequencies might correlate, and interact with the time-frequency resolution was essential for developing the features of interest presented in this study. This would not be possible with Matlab plots alone, as there is not enough memory available to compute a full continuous wavelet transform with 40 scales per octave of a signal with 400 million samples (uncompressed this would require a 500 GB image). Even if there were enough memory, it would not be possible to navigate the result interactively due to the sheer size of the image matrix. Freq solves this by quickly computing the image iteratively as one interacts with the data, while providing prompt access to the corresponding waveform of an event and filtering out frequencies of no interest. Figure 2 shows such a rendering of a continuous wavelet transform.
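The 500 GB figure can be checked with simple arithmetic. The sketch below is only a rough estimate: the number of octaves (8, roughly 1 Hz to 250 Hz) and the storage size per value (4 bytes) are assumptions made here and are not stated in the text.

```python
# Rough memory estimate for a dense continuous wavelet transform image.
SAMPLES = 4 * 10**8        # signal length stated in the text
SCALES_PER_OCTAVE = 40     # frequency resolution stated in the text
OCTAVES = 8                # ASSUMPTION: about 1 Hz up to 250 Hz
BYTES_PER_VALUE = 4        # ASSUMPTION: one single-precision value per pixel


def cwt_image_bytes(samples, scales_per_octave, octaves, bytes_per_value):
    """Size in bytes of a time-frequency image with one value per sample and scale."""
    return samples * scales_per_octave * octaves * bytes_per_value


size_gb = cwt_image_bytes(SAMPLES, SCALES_PER_OCTAVE, OCTAVES, BYTES_PER_VALUE) / 1e9
print(f"{size_gb:.0f} GB")  # prints "512 GB" under the assumptions above
```

Under these assumptions the image is a few hundred times larger than the couple of GB of RAM mentioned in the glossary, which is why the image has to be computed piecewise on demand.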

While Freq takes a manual approach to data analysis, it was used in conjunction with Matlab to develop numeric parameters that can be evaluated for each of the features of interest. The scripts computing parameter estimates could then be validated by plotting the values on top of the images in Freq to see if they match the predictions of the human eye, and similarly to aid in understanding when and what the parameters and classifiers actually detected.

1.4 Purpose

The hypothesis here is that a simple, cheap EEG recording over a short period of time can be used to detect signs of depression. More specifically:

1. This study will develop one or more classifiers that can separate the strain Sprague Dawley from the strain Flinders sensitive line, and find out how long the recordings need to be for the classifier to give correct predictions. Each strain is inbred and the individuals within the group should thus be homogeneous. From this it is assumed that there is no variation in how severe the depression is within each strain.

2. The classifier will then be applied to EEG data recorded after ketamine treatment to evaluate if the prediction of the classifier is affected by the treatment. Ketamine is a proposed antidepressant, so if the classifier weighs more towards the non-depressed model after ketamine treatment, it is a sign that ketamine works as an antidepressant. Ketamine has shown promising effects in humans, where patients themselves report a rapid antidepressant effect, but the effects also decay rapidly. The approach proposed here will study what happens to the EEG after a ketamine treatment and compare it to the baseline of a non-depressed EEG.

2 Method

Several parameters of the recorded EEG data were studied to see how well they describe a separation between the strains, how long the recordings need to be to observe a significant difference, and whether any difference is affected by ketamine administration.

We assume that each strain is homogeneous and will not focus on individual differences between rats within the same strain. The Flinders sensitive line (FSL) rats are inbred, which makes their DNA very homogeneous, and we assume that their expressed traits are therefore also sufficiently homogeneous.

This study examined two strains of rats, depressive and non-depressive, to identify signal processing classifiers that separate the two.

The results of the various preprocessing steps were used to evaluate the parameters for building strain classifiers, which were then used to evaluate treatment effects of ketamine.

The signal will be separated into segments representing different types of activity so that it is possible to study how features vary during different states.

2.1 Data material

The EEG recordings used in this study were single channel recordings between two epidural electrodes, that is, the electrodes were placed on the dura mater above the parietal and the prefrontal cortex. The signal from the electrodes was then wirelessly transmitted with a telemetry system to the recording computer where the signal was stored.

Figure 3: Power spectral density in baseline recordings, shown in two panels as signal strength in dB (-30 to 0) against frequency in Hz (1 to 100). The panel "Signal strength" shows EEG power (SD), EEG power (FSL) and the filter. The panel "EEG power scaling by 1/f^n" shows Filter^-1 × EEG together with scaling by 1/f^0.42 and scaling by 1/f^1.13. The bottom plot illustrates that the EEG power strongly resembles the amplitude response of the recording device. Note that while the signal strength at 100 Hz is an order of magnitude weaker, there is still significant signal content remaining after filtering by the recording device. The spectral power was estimated by taking the median of the square of each frequency bin in a spectrogram produced with a Hamming window with a length of 8192 samples and 50% overlap (window lengths of 256 or 65536 produced virtually identical PSDs). The left plot illustrates that the common mean of both strains is a good estimator of the mean of the separate strains.
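The spectral estimate described in the caption of figure 3 (median over time of the squared magnitude in each frequency bin, Hamming window, 50% overlap) can be sketched as follows. This is a minimal illustration and not the code used in the study: it uses a naive DFT, a toy window length of 32 instead of 8192, and a synthetic sinusoid instead of EEG data. All function names are hypothetical.

```python
import cmath
import math
from statistics import median


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_spectrum(frame):
    """Squared DFT magnitude for bins 0..n/2 (naive O(n^2) DFT, fine for a toy size)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def median_psd(y, window_length):
    """Median over time of each frequency bin, using frames with 50% overlap."""
    w = hamming(window_length)
    hop = window_length // 2
    frames = [[w[i] * y[s + i] for i in range(window_length)]
              for s in range(0, len(y) - window_length + 1, hop)]
    spectra = [power_spectrum(f) for f in frames]
    return [median(column) for column in zip(*spectra)]


# Toy signal: a sinusoid that falls exactly on bin 4 of a 32-sample window.
N = 32
y = [math.sin(2 * math.pi * 4 * t / N) for t in range(8 * N)]
psd = median_psd(y, N)
peak_bin = max(range(len(psd)), key=psd.__getitem__)
```

Taking the median rather than the mean across frames makes the estimate less sensitive to short, high-amplitude artefacts, which matches the robustness argument made for MAD elsewhere in the report.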

The healthy strain was represented by 8 animals from Sprague Dawley (SD), and the depressed strain was represented by 8 animals from FSL. The 16 recordings span 9.2 days with a gap of missing data from day number 7. The telemetry system used was a wireless transmitter called F40-EET by Data Sciences International (DSI). The wireless transmitter applied a bandpass filter from 0.1 Hz to 100 Hz before digitizing and then encoded the signal with a varying sample rate around 250 Hz [23]. The recording computer then resampled this data to pulse code modulation with a sample rate of 500 Hz. Some data is missing due to the amplitude reaching out-of-bounds values (from artefacts) or transmission errors. The rats were freely moving with cage mates in their home cages during the recordings.

The prior project used two different levels of ketamine treatment, 10 mg/kg and 30 mg/kg, compared to a vehicle of 0 mg/kg. The doses were randomly administered and each animal had a dose at the beginning of day 3 and day 6, counting from the beginning of the experiment. Each day starts with 12 hours of lights on to induce a subjective night, followed by 12 hours of lights out inducing a subjective day (the rats are nocturnal, awake when it is dark and asleep when it is light).

While the recording device has a specified bandwidth of 1-50 Hz [23], it delivers signal content up to 140 Hz. However, the varying sample rate of the F40-EET is at most 250 Hz and mostly closer to 200 Hz [24]. This means that signal content above 125 Hz, the Nyquist frequency for a 250 Hz sample rate, should be purely resampling artefacts, and similar artefacts from temporal aliasing are probably present below 125 Hz as well. Figure 3 shows a power spectrum of the recorded data. The power of electrical activity in a brain scales with $1/f^n$, where $f$ is the frequency and $n \in [1, 2]$ is an exponent whose exact value depends on various factors [10].

When the signal is normalized by scaling with $f^n$ and compared to the amplitude response of the recording device, it is seen that frequencies above 50 Hz are attenuated as expected by the recording device, which means that the power of the original signal in the brain follows the same curve $1/f^n$ to at least 100 Hz. Low frequencies below 2 Hz also follow the expected $f^n$ curve, whereas the theta band (4-10 Hz) is comparatively stronger than the other bands with a peak at 6-7 Hz.

The power of different frequencies in the EEG recording (that is, the square of the Fourier amplitude) is inversely related to the temporal frequency $f$. In these recordings, power below 7 Hz followed an estimated scaling by $1/f^{0.55}$ and power above 10 Hz followed an estimated scaling by $1/f^{1.20}$. See figure 3. The Appendix contains listings of detailed time-frequency plots of the studied data.
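The text does not state how the scaling exponents were estimated. One common approach, shown here only as an assumed illustration, is a least-squares line fit in log-log coordinates: if the power is proportional to $1/f^n$, then $\log P = c - n \log f$ and the slope of the fitted line is $-n$. The function name and the synthetic spectrum below are hypothetical.

```python
import math


def fit_power_law_exponent(freqs, power):
    """Least-squares slope of log(power) against log(f); returns n in 1/f^n."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic spectrum that follows 1/f^1.20 exactly between 10 Hz and 100 Hz.
freqs = [10 + k for k in range(91)]
power = [f ** -1.20 for f in freqs]
n_est = fit_power_law_exponent(freqs, power)
```

On real spectra the fit would be restricted to one frequency range at a time (for example below 7 Hz or above 10 Hz), since the text reports different exponents for the two ranges.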


2.2 Notations and common equations

A vector of recorded single-channel EEG data is denoted $y = \{y_i\}$ where $y_i \in \mathbb{R}$. Such a vector may span a short segment of a longer recording, the whole week or a concatenation of several shorter segments. The different recordings used are denoted $R_i$ and the complete dataset from one recording is denoted $y_{R_i}$. Matrices and sets are denoted with capital letters $X$. A subset function $y = \Gamma(y_{R_i})$ produces a smaller vector by selecting one or multiple concatenated segments from the full recording. A commonly used subset function is the baseline function $\Gamma_{\text{baseline}}$, which selects the full second 24 hour day of the recording. Other examples include functions that only select sleep or awake segments. EEG parameter functions $\mathbf{f} = \mathbf{f}(y) = \{f_i(y)\}$ extract some real vector-valued parameter from an EEG segment. A parameter may also be a scalar $f = f(y)$.
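As an illustration of the subset notation, the sketch below expresses the baseline function and a generic segment selection as plain slicing. The function names are hypothetical, and the 500 Hz sample rate is taken from the description of the resampled data; how the actual implementation indexes the recording is not stated in the text.

```python
FS = 500                    # samples per second after resampling (from the text)
DAY = 24 * 60 * 60 * FS     # number of samples in one 24 hour day


def gamma_baseline(y_full):
    """Select the full second 24 hour day of a recording."""
    return y_full[DAY:2 * DAY]


def gamma_segments(y_full, segments):
    """Concatenate several (start, stop) sample ranges into one vector."""
    out = []
    for start, stop in segments:
        out.extend(y_full[start:stop])
    return out


# A range object stands in for a three day recording without allocating memory.
recording = range(3 * DAY)
baseline = gamma_baseline(recording)
```

A sleep or awake subset would use `gamma_segments` with the sample ranges produced by the sleep scoring step.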

The sum over $j$ in integer steps from $a$ to $b$ is denoted $\sum_{j=a}^{j=b} y_{i,j}$ or $\sum_{j=a}^{b} y_{i,j}$, or just $\sum_j y_{i,j}$ if $j$ has a natural range such as the number of elements in $y$, or just $\sum y_i$ if the index is $i$ and no other variable is being summed over. The mean value of a vector $y$ is denoted $\operatorname{mean}_i y_i$ and typically the character $\mu$ is used to represent a mean value. Similarly, the standard deviation is denoted $\operatorname{std}_i y_i$ and represented by the character $\sigma$. The median is denoted $\operatorname{median}_i y_i$, and $\operatorname{mad}_i y_i$ is the median absolute deviation. Equation references look like this: (1).

$$\operatorname{mad}_i y_i = \operatorname{median}_i \left| y_i - \operatorname{median}_j y_j \right| \tag{1}$$
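Equation (1) can be sketched directly with the Python standard library. RMS is included because the report compares the two measures of signal strength. The function names and the toy data are illustrative only.

```python
import math
from statistics import median


def mad(y):
    """Median absolute deviation as in equation (1)."""
    m = median(y)
    return median([abs(v - m) for v in y])


def rms(y):
    """Root mean square of the samples."""
    return math.sqrt(sum(v * v for v in y) / len(y))


clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]   # one extreme value, e.g. an artefact
```

Here `mad(clean)` and `mad(with_outlier)` are both 1, while `rms` grows from about 3.3 to about 44.7, which illustrates why MAD is described as insensitive to extreme values.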

Regression quality on scalar response: $R^2$ and $\bar{R}^2$

Three measures that will be used repeatedly to evaluate the quality of regressions are $R^2$, $\bar{R}^2$ and $\kappa$. $R^2$ can be interpreted as how much of the variance in a response variable is explained by the explanatory variables. In a linear regression $\beta = \arg\min_\beta \|X\beta - Y\|$ the response variable is $Y$, the explanatory variables are $X$, $\beta$ is the vector of parameter estimates and $\|y\|$ is some norm of $y$ (for the Euclidean norm, also written $L_2$ or $\|y\|_2$, this regression is a least squares fit). The same concept of response variables, explanatory variables and parameter estimates applies to generalized linear regressions such as a binomial or multinomial regression (or arbitrary regressions for that matter), although the interactions between the variables may be more complex. $\bar{R}^2$ (adjusted $R^2$) takes into account that more explanatory variables (more columns in $X$) should always explain more of the variance in the response variable $Y$.

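$$\begin{aligned}
y_i &= \text{response variable} \\
\hat{y}_i &= \text{predicted response variable} \\
p &= \text{number of predictor variables} \\
\bar{y} &= \operatorname{mean}\{y_i\}
\end{aligned}$$

$$R^2 \equiv 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \tag{2}$$

$$\bar{R}^2 = 1 - (1 - R^2)\frac{n_y - 1}{n_y - p - 1} = R^2 - (1 - R^2)\frac{p}{n_y - p - 1} \tag{3}$$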
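The two measures can be sketched directly from their definitions. This is a minimal Python illustration; the function names and the example numbers are assumptions made for this sketch.

```python
def r_squared(y, y_pred):
    """R^2 = 1 - sum((y_i - yhat_i)^2) / sum((y_i - ybar)^2)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n samples and p predictor variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.9, 4.9]
r2 = r_squared(y, y_pred)
print(r2, adjusted_r_squared(r2, n=len(y), p=1))
```

For a fixed $R^2$ the adjusted value decreases as $p$ grows, which is the penalty for adding explanatory variables.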

Binomial and multinomial regression

Binomial regression is a generalized linear model for classifying data into one of two groups based on their distance to a hyperplane. The hyperplane is defined by a scalar product $\beta x$ between a $\beta$ vector and the explanatory variables $x$ (where $x$ is assumed to include an intercept, meaning that one of the values in $x$ is 1 so that $\beta$ can capture the mean). The hyperplane is found by maximizing the separation in a dataset with known classes and is subsequently used when classifying samples of unknown class. The result is a probability for each possible class, i.e. a value in the range $[0, 1]$. Binomial regression is used for two classes and multinomial regression is a generalization for more than two classes which uses the distances to multiple hyperplanes $\beta_k$, one for each class $k \in K$. Based on the hyperplane distances $\delta_k = \beta_k x$, multinomial regression calculates probabilities for each class using the softmax function (4).

$$\operatorname{softmax}(k, \delta) = \frac{e^{\delta_k}}{\sum_{i}^{K} e^{\delta_i}} \tag{4}$$

The softmax function is approximately 1 when $\delta_k$ is the largest element in $\delta$ and 0 otherwise, but makes a smooth transition between classes. The multinomial (or binomial) probability of class $k$ is thus calculated by (5).

$$\Pr(Y = k \mid x) = \frac{e^{\beta_k x}}{\sum_{i}^{K} e^{\beta_i x}} \tag{5}$$

Note that the sum of probabilities of all classes is 1 (6).

$$\sum_{k \in K} \Pr(Y = k \mid x) = 1 \tag{6}$$

The MATLAB function mnrfit with default arguments was used to calculate $\beta$ given a matrix of explanatory variables $X$ and known classes $Y$.
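The probability calculation in equations (4) to (6) can be sketched as follows. This is not a reimplementation of mnrfit, whose internal parameterisation may differ. The coefficient values in `betas` are made up for illustration, and subtracting the maximum before exponentiating is a standard numerical safeguard added here, not something stated in the text.

```python
import math


def softmax(delta):
    """Softmax over a list of hyperplane distances delta_k."""
    shift = max(delta)  # improves numerical stability, result is unchanged
    exps = [math.exp(d - shift) for d in delta]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(betas, x):
    """Pr(Y = k | x) for each class k, with delta_k = beta_k . x."""
    delta = [sum(b * xi for b, xi in zip(beta_k, x)) for beta_k in betas]
    return softmax(delta)


# Hypothetical coefficients for three classes; x[0] = 1 is the intercept term.
betas = [[0.0, 1.0], [0.5, -1.0], [0.0, 0.0]]
probs = class_probabilities(betas, [1.0, 2.0])
print(probs, sum(probs))
```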

Regression quality on nominal response: Cohen's $\kappa$ and adjusted Cohen's $\bar{\kappa}$

Cohen's $\kappa$ (kappa) is a measure of agreement between two nominal classifications (also called interexpert agreement), such as classifying according to the most likely response in a binomial or multinomial regression. It takes into account the proportion of identical classifications $\Pr(a)$, as well as the possibility of agreement by chance $\Pr(e)$. The two classifications $C_1$ and $C_2$ are typically a true response variable $Y$ and an estimated response variable $\hat{Y}$.

$$\begin{aligned}
\kappa(C_1, C_2) &= \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \\
\Pr(a) &= \Pr(C_1 = C_2) \\
\Pr(e) &= \sum_{c \in C_1 \cup C_2} \Pr(C_1 = c) \cdot \Pr(C_2 = c)
\end{aligned} \tag{7}$$

Where $C_1 \cup C_2$ refers to all possible classifications in $C_1$ or $C_2$. Adjusted Cohen's kappa $\bar{\kappa}$ (8) is a novel measure that applies the idea of adjusted $R^2$ to Cohen's kappa to account for overfitting with too many explanatory variables $p$ compared to the number of samples $n_y$.

$$\bar{\kappa}(C_1, C_2) = 1 - \left(1 - \kappa^2(C_1, C_2)\right)\frac{n_y - 1}{n_y - p - 1} \tag{8}$$
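Cohen's kappa as defined in equation (7) can be sketched as below. The function name and the example labels are illustrative assumptions. The sketch does not handle the degenerate case $\Pr(e) = 1$, where the denominator is zero.

```python
from collections import Counter


def cohens_kappa(c1, c2):
    """Cohen's kappa between two equally long sequences of nominal labels."""
    n = len(c1)
    pr_a = sum(a == b for a, b in zip(c1, c2)) / n  # observed agreement
    count1, count2 = Counter(c1), Counter(c2)
    labels = set(count1) | set(count2)
    # Agreement expected by chance from the marginal label frequencies.
    pr_e = sum((count1[c] / n) * (count2[c] / n) for c in labels)
    return (pr_a - pr_e) / (1.0 - pr_e)


manual = ["sleep", "sleep", "awake", "awake"]
automatic = ["sleep", "awake", "awake", "awake"]
print(cohens_kappa(manual, automatic))  # 0.5
```

Here three of four labels agree ($\Pr(a) = 0.75$) while chance agreement is $\Pr(e) = 0.5$, giving $\kappa = 0.5$.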

2.3 Workflow

The software Freq, developed by the author, was used to navigate and visualize the signals, and MATLAB was used for developing signal processing algorithms.

The named frequency bands listed in table 3 were not assumed in the methods employed in this study. Rather, the spectrograms were analyzed with signal decomposition techniques such as non-negative matrix factorization (NMF), principal component analysis (PCA) and independent component analysis (ICA) to find prominent frequency bands that describe the EEG recording. The named frequency bands are still used when referring to frequency ranges in the discussions of figures.

The EEG data was first preprocessed to produce segments and data representations that were easier to work with (section 2.4). Candidates for the various parameters were primarily identified through visual inspection of time-frequency plots of the signals using Freq.

Each parameter is defined by a vector-valued function detailed in section 2.5. The parameters were then used as strain classifiers and their qualities as such were evaluated (section 2.6). Finally the effects of Ketamine treatment were evaluated by studying the effects on the strain classifiers (section 2.7). The workflow of the method used in this study is summarized in figure 4.

[Figure 4: workflow diagram. Boxes grouped by stage. Preprocessing: EEG recording $y_{R_i}$, Artefact removal, Power normalization, Loglognormal spectrogram, Spectrogram decomposition, Sleep scoring, Subset segmentation. EEG parameters: EEG recording $y_{R_i}$, Features of interest, Parameters $f(y)$. Strain classification: Parameter decomposition, Strain classifier, Classifier quality $Q_p$, $Q_\kappa$. Ketamine effects.]

Figure 4: Workflow for building classifiers, assessing their quality and examining Ketamine effects. Each box is described in subsequent sections below.

2.3.1 Data management

When studying the signal in MATLAB the data had to be analyzed in smaller subsets, both in order to run small iterations fast during development and to avoid exhausting the available working memory of the computer. Intermediate copies kept by algorithms (even built-in MATLAB functions) can easily require on the order of 10 times the size of the input, or more if the complexity of the algorithm is non-linear. During development it was of interest to store intermediate results if they took a long time to compute, but this was not feasible for more than a few copies of results of the whole dataset. Each of the steps outlined in figure 4 roughly corresponds to such intermediate results.

An important side effect of this project was the improvement of Freq to keep its memory footprint low even when studying such large datasets. Instead of keeping the whole file unpacked in memory, Freq only loads comparatively small chunks of the source data when computing the corresponding section of the time-frequency plots. The plot is then resampled to viewing resolution, where a pixel value is aggregated from a whole area of the actual time-frequency plot by using a max filter. Overall this provided a good tradeoff between caching and memory footprint, and it ended up processing data faster than the previous version of Freq due to improved memory access patterns, with the overhead of reloading data from disk hidden by parallelization. It was effective enough that these datasets could be studied interactively through Freq on a tablet (iPad 2 Mini). Through Freq it was possible to study the whole dataset using its waveform trace, spectrograms and continuous wavelet transforms when looking for patterns of interest.

[Figure 5: four panels. "Artefact: constant value", "Artefact: high amplitude" and "Indirect artefact: short segment" show Signal [mV] against Time [s] from -10 to 10 s, each with the input, the detected artefact and the result (the high amplitude panel also shows the threshold). "Artefacts" shows Artefacts [%] against Time [h] over the 24 h baseline (night 0 to 12 h, day 12 to 24 h) for SD and FSL.]

Figure 5: Cleaning input data from 3 different types of artefacts and removing transient effects from edges to removed data. The last plot shows the amount of samples in artefacts per hour during the 24h baseline in all 16 animals.

2.4 Preprocessing

The raw data contained artefacts that interfered with the measurements, and the different recordings were scaled differently. The spectrogram was decomposed into simpler (fewer) frequency bins, similar to the named frequency ranges listed in the Introduction.

2.4.1 Artefact removal

To avoid false detections originating in non-neurological sources the data was cleaned from artefacts before analysis. There are multiple potential sources of artefacts in an EEG recording [25]. In EEG recordings from humans a common source is eye movement or blinking (ocular artefacts) and other facial muscles that interfere with EEG electrodes placed on the front of the skull. Other muscles such as neck muscles may also interfere. It was previously believed that electric signals from muscular activity (electromyography, EMG) were confined to high frequencies (above 20 or 40 Hz) but it has since been shown that such signals have a much wider frequency range, from 0 Hz to more than 200 Hz. Yet another source of artefacts may be interference from electronic equipment, for example the 50 Hz or 60 Hz power supply signals.

In these recordings the artefacts should be limited to movement artefacts and transmission artefacts. Movement artefacts occur when movements interfere with the wire from the electrode to the transmitter in the abdominal space. Transmission artefacts occur when the rat is in a position where the signal to the receiver in the bottom of the cage is too weak to be decoded. The transmitter sends the signal wirelessly, encoded as a digital PWM signal, which if obstructed by electromagnetic interference would yield a clearly obstructed output signal.

Streaks of constant values were removed. Some segments of the data contained long streaks of constant values (not necessarily only zeros), which are unnatural and are assumed to be the result of a transmission error. When more than 7 values in a row had the exact same numerical value (to available numerical precision) the entire segment of constants was discarded as an artefact. Manual inspection indicated that 5 equal values in a row could happen by chance.
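The streak rule described above can be sketched as a run-length scan. The function name `mask_constant_streaks` and the use of NaN as the marker for discarded samples in a plain Python list are assumptions for this illustration; the thesis code was written in MATLAB and is not reproduced here.

```python
import math


def mask_constant_streaks(y, max_run=7):
    """Return a copy of y where runs of more than max_run equal values are NaN."""
    out = list(y)
    n = len(y)
    start = 0
    while start < n:
        end = start + 1
        while end < n and y[end] == y[start]:
            end += 1
        if end - start > max_run:  # more than 7 identical values in a row
            for i in range(start, end):
                out[i] = math.nan
        start = end
    return out


signal = [0.1, 0.2] + [0.5] * 8 + [0.3] + [0.4] * 7
cleaned = mask_constant_streaks(signal)
print(cleaned)  # the run of eight 0.5 values becomes NaN, the run of seven 0.4 stays
```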

Extremely high amplitudes were discarded as artefacts with a threshold $|y_i| > 20\,\operatorname{median}|y|$. The median of the absolute value is a stable measure of scale, insensitive to outliers. It equals the measure mad if $\operatorname{median}(y) = 0$. Such large amplitudes may be a natural effect of an actual voltage gap between the electrodes but they are probably not an EEG signal (a voltage gap originating from synchronously firing neurons).
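A minimal sketch of the amplitude threshold follows. The function name is hypothetical, and ignoring already discarded (NaN) samples when taking the median is an assumption of this sketch rather than something the text specifies.

```python
import math
from statistics import median


def mask_high_amplitude(y, factor=20.0):
    """Set samples with |y_i| > factor * median(|y|) to NaN."""
    magnitudes = [abs(v) for v in y if not math.isnan(v)]
    threshold = factor * median(magnitudes)
    # Comparisons with NaN are False, so existing NaN samples pass through unchanged.
    return [math.nan if abs(v) > threshold else v for v in y]


signal = [1.0, -1.0, 2.0, -2.0, 1.0, 50.0]
print(mask_high_amplitude(signal))  # median |y| is 1.5, so only 50.0 exceeds 30
```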

Some data segments were also missing in the original recording, most likely due to movement while awake that interfered with the transmission. The remaining data contained short bursts of data within segments of otherwise discarded data. Short segments are unreliable for analysis so segments shorter than 20 seconds were also discarded as artefacts.
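The short-segment rule can be sketched by scanning for runs of valid samples between NaN gaps. The function name and the list-with-NaN representation are assumptions. This sketch also discards short runs at the very start or end of the vector, which the text does not address.

```python
import math


def mask_short_segments(y, min_len):
    """Set runs of valid samples shorter than min_len samples to NaN."""
    out = list(y)
    n = len(y)
    i = 0
    while i < n:
        if math.isnan(y[i]):
            i += 1
            continue
        j = i
        while j < n and not math.isnan(y[j]):
            j += 1
        if j - i < min_len:  # valid run too short to be reliable
            for k in range(i, j):
                out[k] = math.nan
        i = j
    return out


nan = math.nan
print(mask_short_segments([1.0, 2.0, 3.0, nan, 4.0, nan, 5.0, 6.0, 7.0], min_len=3))
```

With a 20 second limit and a sample rate `fs` in Hz, `min_len` would be `20 * fs` samples.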

Transient edges to artefacts were smoothed with a sinusoidal ramp-in and ramp-out window over the course of 10 seconds before and after a discarded segment. Artefacts were encoded as Not-A-Number just like missing data, so such segments may be seen in the figures as gaps in trace plots of the waveform or in time-frequency plots. Examples from the removal of each artefact type are illustrated in figure 5. Note that there are more artefacts per hour during the subjective day. On average 2% of all samples were discarded as artefacts. The EEG parameter Artefacts (section 2.5) examines the amount of artefacts in different subsets of the data.
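The edge smoothing can be illustrated with a raised-cosine ramp. The exact window shape is not given in the text, so the formula in `ramp_in` is an assumption, as are the function names. The sketch weights each valid sample by its distance to the nearest NaN gap.

```python
import math


def ramp_in(n):
    """n weights rising smoothly from near 0 to near 1 (assumed raised-cosine shape)."""
    return [0.5 * (1.0 - math.cos(math.pi * (k + 0.5) / n)) for k in range(n)]


def smooth_gap_edges(y, n_ramp):
    """Fade samples within n_ramp samples of a NaN gap towards zero."""
    n = len(y)
    dist = [n_ramp + 1] * n  # distance to nearest NaN, capped at n_ramp + 1
    for order in (range(n), range(n - 1, -1, -1)):  # forward, then backward pass
        last = None
        for i in order:
            if math.isnan(y[i]):
                last = i
            elif last is not None:
                dist[i] = min(dist[i], abs(i - last))
    weights = ramp_in(n_ramp)
    return [v if math.isnan(v) or d > n_ramp else v * weights[d - 1]
            for v, d in zip(y, dist)]


signal = [1.0] * 5 + [math.nan] + [1.0] * 5
print(smooth_gap_edges(signal, n_ramp=3))
```

For a 10 second ramp at a sample rate `fs` in Hz, `n_ramp` would be `10 * fs` samples.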

2.4.2 Power normalization

EEG recordings from FSL rats had a significantly stronger power measured in dB than the recordings from SD ($p = 0.012$), see figure 6. The signal power would affect all parameters developed from the data so each signal was normalized by its total median absolute deviation. The mean power (i.e. the standard deviation of the signal) was not used because the signal strength was not normally distributed (non-Gaussian). This applies both to the signal as a whole and to all frequency bands independently, see figure 7.

The effect of varying signal power was removed by normalizing the signal to $\operatorname{mad} = 1.0$, giving a resulting unit proportional to volt [~V]. This assumes that the overall power of a strain is not a biomarker in itself but merely an artefact of the recording setup. Figure 6 shows violin plots with the estimated distributions of the power and mean of the raw data before normalization. For all practical purposes the mean is identical to 0.0 in all animals, as the numerical precision of the recorded data is of the same magnitude. The value closest to 0 except 0 in the data is approximately $3\mathrm{e}{-7} = 3 \cdot 10^{-7}$.

Variations in head size may cause the corresponding recording to vary not only in amplitude but also in frequency range, as different frequencies are not attenuated the same way when propagating through the brain and skull. This observation makes variables that merely compare frequency power between the strains less reliable.

2.4.3 Loglognormal spectrogram

The loglognormal spectrogram $S_{m,i}$ is the logarithm of a power spectrogram resampled to a logarithmic frequency range. It is normalized for each frequency bin $m$ with an offset and scale calculated from the loglog spectrogram of all concatenated baseline recordings.

[Figure 6: two violin plots comparing SD and FSL, with mean µ and median marked. "EEG mean, p=0.7": Amplitude [uV]. "EEG power (dB), p=0.012": Power [dB].]

Figure 6: EEG signal mean voltage and standard deviation. Note that the difference between the weakest and strongest recording is about a factor of 10 (from -5 to +5 dB).

[Figure 7: "Power distribution". Three histogram panels over frequency (0 to 80 Hz), with the horizontal axis spanning -2σ to 2σ around µ: "Fourier amplitude", "Fourier amplitude squared" and "Log of Fourier amplitude".]

Figure 7: Power histograms with the Fourier amplitude compared to the logarithm of the Fourier amplitude.

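$$\begin{aligned}
F_k\{x\} &= \frac{1}{n_x}\sum_{j=0}^{n_x-1} x_j \, e^{-i 2\pi \frac{k}{n_x} j} \\
f_k &= k f_s / n_x \\
g_m &= \exp\left(\ln 0.4 + \frac{\ln 120 - \ln 0.4}{40 - 1}(m - 1)\right) \\
K_m &= \left\{ k : m = \arg\min_{m'} \left|\log f_k - \log g_{m'}\right| \right\} \\
K' &= \{K_m : |K_m| > 0\} \\
G_m\{x\} &= \operatorname{mean}_{k \in K'_m} |F_k\{x\}|^2 \\
H_i(x) &= x_i h_i \\
Y_{m,i}\{y\} &= 10 \log_{10}\left(G_m\{H(y_{iwo\,:\,iwo+w})\}\right) \\
S_{m,i}\{y\} &= \frac{Y_{m,i}\{y\} - \operatorname{mean}_{i'} Y_{m,i'}\{y'\}}{\operatorname{std}_{i'} Y_{m,i'}\{y'\}}
\end{aligned} \tag{9}$$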

Where $F_k\{x\}$ is the discrete Fourier transform of $x$, $f_s$ is the sample rate of $x$, $f_k$ are the linearly distributed frequencies of the Fourier spectrum ($f_k$ gives the linear center of each bin), and $g_m$ are the logarithmically redistributed frequencies ($g_m$ gives the logarithmic center of each bin). $G_m$ is the resampled Fourier transform with squared amplitudes. The resampling is done by taking the mean of a nearest mapping where $K_m$ denotes the set of $f_k$ bins that go into a $g_m$ bin. $H$ is a window function with a window vector $h_i$ (omitted). $Y_{m,i}\{y\}$ is the spectrogram of $y$ and $y_{iwo\,:\,iwo+w}$ denotes the subset of the vector $y$ spanning from element $y_{iwo}$ to (but not including) element $y_{iwo+w}$. $y'$ is a training baseline segment, concatenated from stratified samples of all baseline recordings.

The logarithmic frequency axis ends up with 22 bins from the 61 bins in the linear frequency axis between 1 Hz and 120 Hz, based on a window of length 256 samples and a sample rate of 500 Hz.
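The nearest mapping from linear Fourier bins to logarithmic bins can be sketched as follows. The function names are hypothetical, the bin centre of linear bin $k$ is taken as $k f_s / w$ for window length $w$, and bins are indexed from 0 here whereas $m$ starts at 1 in the text.

```python
import math


def log_bin_centers(f_lo=0.4, f_hi=120.0, n_bins=40):
    """Logarithmically spaced bin centres g_m (index 0 corresponds to m = 1)."""
    step = (math.log(f_hi) - math.log(f_lo)) / (n_bins - 1)
    return [math.exp(math.log(f_lo) + step * m) for m in range(n_bins)]


def nearest_log_bin_map(fs=500.0, window=256, f_min=1.0, f_max=120.0):
    """Map each linear bin k within [f_min, f_max] to its nearest log bin."""
    g = log_bin_centers()
    mapping = {}  # log bin index -> list of linear bin indices (the sets K_m)
    for k in range(1, window // 2 + 1):
        f_k = k * fs / window
        if not f_min <= f_k <= f_max:
            continue
        m = min(range(len(g)), key=lambda idx: abs(math.log(f_k) - math.log(g[idx])))
        mapping.setdefault(m, []).append(k)
    return g, mapping


g, mapping = nearest_log_bin_map()
print(sum(len(v) for v in mapping.values()), len(mapping))
```

With a 256 sample window at 500 Hz this sketch maps 61 linear bins onto 22 non-empty logarithmic bins, in line with the numbers stated above.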

2.4.4 Spectrogram decomposition

By visual inspection of a spectrogram visualization it becomes clear that the time-varying spectral content of this dataset mostly falls into a limited set of components with few intermediate situations (see figure 2). Either high frequencies are higher than average (as in awake and REM sleep) or low frequencies are higher than average (as during non-REM sleep), but it rarely happens, for instance, that both high and low frequencies are simultaneously higher than average, or simultaneously at the average level. The data thus does not seem to be normally distributed. This is also supported by two separate peaks (roughly corresponding to awake/REM and non-REM) in the probability density estimate of the first PCA component shown in figure 9. The independent clusters in the first (strongest) component mean that a shift in measures such as mean or std should be interpreted as a weakening or growth of a cluster rather than a shift in the numeric values. The segmentation based on sleep states does not have this problem.

To separate the spectral content into different types of EEG activity we run an independent component analysis on the time-dependent spectral content of the EEG to score the signal into segments of different activity. As noted in the previous paragraph all frequency components of the signal were non-Gaussian, which is a requirement for ICA and an obstacle to finding good components through the related principal component analysis. The signal has to be whitened first though, which is accomplished by subtracting the median and dividing by the median absolute deviation, MAD [26], which is a robust measure of scale. This yields median = 0 and MAD = 1. It is analogous to whitening a Gaussian dataset by transforming the data to a z-distribution (with mean = 0 and standard deviation = 1).

ICA assumes that components can be linearly combined, which is at most approximately true, but it is close enough to be useful albeit not optimal [27].

In order to do an efficient decomposition the spectrogram data has to be normalized first. The spectrogram this analysis is based on uses a Hamming window with a width of 256 samples and 50% overlap. Natural events are typically logarithmically distributed so it makes sense to resample the linearly distributed Fourier bins to a logarithmic distribution over Hz and express the power of all Fourier amplitudes in dB, which also has a logarithmic scale. This loglog spectrogram is then normalized to compensate for the naturally large differences in strength between high and low frequencies. The normalization is performed by subtracting the mean over all time steps and dividing by the corresponding standard deviation on each frequency bin independently. The resulting spectrogram is then smoothed with a box filter covering 4 frequency bins and 200 time steps; the filter size was chosen arbitrarily and the smoothing probably does not help. This yields what is here referred to as the loglognormal spectrogram. The normalization coefficients (offset and scale for each bin) are saved so that the loglognormal spectrogram can then be computed for an arbitrary segment of EEG data and yield comparable frequency information.

The decomposition performed here also acts as a filter that keeps prevalent properties but discards rare events and noise. How much of the data is kept after the decomposition can be measured by the sum of eigenvalues after decomposition divided by the sum of eigenvalues before decomposition. This coefficient has the same value as $R^2$ when measuring how much of a response variable is explained by a linear regression. A lot of data is also discarded before that, when resampling the linear frequency axis of the Fourier transform to a logarithmic representation with lower resolution (22 bins compared to 256 bins).

78% of the eigenvalues of the spectrogram are retained with just one PCA component. In addition to the PCA decomposition, the spectrogram was also decomposed using independent component analysis (ICA) and non-negative matrix factorization (NMF). With fewer components the results tend to be less sensitive to overfitting as well as easier to interpret.

NMF is a matrix decomposition approach which decomposes a non-negative matrix into two low-rank non-negative matrices [1]. It has previously been successfully applied in the mining of biological data [28].

2.4.5 Sleep scoring

An experienced researcher performed a manual sleep scoring of the 24 hour baseline of the dataset by looking at the waveform of EEG and EMG data, where it is possible to recognize and distinguish the shapes of the EEG during deep sleep, rapid eye movement (REM) sleep and wake activity. This dataset is used as a training dataset for providing spectrogram based sleep scoring for the remaining data.

A period of an EEG recording can largely be assigned to one of two states: sleep and awake. The patterns during awake and sleep are rather different. The classification can be done by a computer algorithm or manually. In this case the dataset included a manual sleep scoring of a baseline recording. Multinomial regression was used with the manual sleep scoring as a training dataset.

The sleep was categorized into REM sleep and non-REM sleep as the EEG signal differs significantly between those two states. REM sleep is in some aspects more akin to awake than to non-REM sleep, although REM sleep only happens after being asleep in non-REM sleep for a while.

Figure 2 shows the first 6 consecutive days of two rats to illustrate sleep patterns in a time-frequency plot. The time-frequency plot is a continuous Morlet wavelet transform with 40 scales per octave. Each 24 hour day starts with a subjective night as the light is turned on. Theta activity is stronger during non-REM sleep [25] and the periods that are stronger in low frequencies show up as dark patterns in the image. It is possible to count the days as 6 periods in the non-depressed model whereas it is hard to do the same in the depressed model. The differences between the images reflect different sleep patterns as the daily rhythm is different between the strains. The non-depressed model to the left has a regular sleep pattern whereas the depressed model on the right has a more irregular sleep pattern. This example is illustrative but not representative, as there are non-depressed models with less regular rhythms and depressed models with less irregular rhythms. Corresponding images from all 16 rats are listed in Appendix A.

The manual sleep scoring used as a target response variable was done at a resolution of 10 second epochs. The spectrogram based sleep scoring was performed through multinomial regression from the quartiles and median of each frequency bin corresponding to each sleep state in the loglognormal spectrogram. The MATLAB functions mnrfit and mnrval were used to build the regression $\beta$. The spectrogram used had a window length of 256 samples, which corresponds to about 0.5 seconds. The multinomial state scores were then filtered by smoothing with a box filter, weak scores were discarded, and the resulting classification was taken according to the highest multinomial score, or from the nearest timestep that was not discarded. The coefficients of the smoothing window length as well as the thresholds for discarding samples were optimized for producing a high Cohen's kappa on half of the recordings for which the manual sleep scoring had been performed. The result section lists the coefficients and the value of Cohen's kappa on the remaining recordings, used as a validation dataset.
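The post-processing of the multinomial scores (smooth, discard weak scores, take the highest score or fall back to the nearest kept timestep) can be sketched as below. The function names, the single shared threshold and the example scores are assumptions; the optimized coefficients are listed in the result section and are not reproduced here. The nearest-neighbour fill is written for clarity, not speed.

```python
def box_smooth(x, width):
    """Moving average with a box window, truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def classify_states(scores, width, threshold):
    """scores maps each state name to a list of per-timestep scores in [0, 1]."""
    smoothed = {s: box_smooth(v, width) for s, v in scores.items()}
    n = len(next(iter(smoothed.values())))
    labels = []
    for i in range(n):
        best = max(smoothed, key=lambda s: smoothed[s][i])
        # Weak scores are discarded (None) and filled in below.
        labels.append(best if smoothed[best][i] >= threshold else None)
    kept = [i for i, lab in enumerate(labels) if lab is not None]
    if not kept:
        return labels
    return [labels[min(kept, key=lambda k: abs(k - i))] for i in range(n)]


scores = {"awake": [0.9, 0.8, 0.5, 0.2, 0.1], "non-rem": [0.1, 0.2, 0.5, 0.8, 0.9]}
print(classify_states(scores, width=1, threshold=0.6))
```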

The spectrogram based sleep scoring was then used to perform segmentation of the dataset and build subsets for further analysis by concatenating all samples classified as the same sleep state. The function $\operatorname{isawake}(y_i) \in \{0, 1\}$ is used in the EEG parameters below and takes the value 1 if the corresponding spectrogram segment around sample $y_i$ was scored as awake, and the value 0 otherwise. Conversely $\operatorname{isasleep}(y_i) = 1 - \operatorname{isawake}(y_i)$. The non-REM and REM states are not separated by these functions as they are both non-awake states. $\text{sleep score}_j(y_i) \in [0, 1]$ is the multinomial score for state $j \in \{\text{awake}, \text{non-rem}, \text{rem}\}$ after smoothing but before discarding scores below the threshold.

2.4.6 Subset segmentation

Table 1: Subsets used for evaluating variables and comparing treatment effects.

Name                    Description
24h                     This baseline started 22 hours after the last human contact when the recording was started.
night                   The first 12 hours of 24h. night refers to the subjective night and since rats are nocturnal animals this is the period where the lights are turned on (when the animals are typically asleep).
day awake               Awake during the last 12 hours of 24h. The subjective day is defined by the period where the lights were turned off.
ketamine night 1        Night starting 2 hours before Ketamine administration.
ketamine day 1 awake    Awake during the day starting 10 hours after Ketamine administration.
ketamine night 3        Night starting 46 hours after Ketamine administration.
ketamine day 3 awake    Awake during the day starting 58 hours after Ketamine administration.

In order to evaluate variables during sleep only or awake only, the data was categorized, or scored, into segments of sleep and non-sleep. The set of signals from all animals was divided into shorter subsets to study how variables differ between strains and over time. The subsets used are listed in table 1.

2.5 EEG parameters

Previously known biomarkers for clinical depression, such as changes in alpha power and irregular sleep patterns, were checked to validate whether the FSL rats seem to function as a model for clinical depression. This is important to verify as there is then a higher chance that the results will be translational, i.e. that similar or identical results would potentially be found in both humans and rats. Successful treatment of rats that eliminates biomarkers for clinical depression might also have a higher chance of being successful in humans.

The potential features of interest were mostly found through interactive visualization in Freq. Parameters are listed in no particular order. $n_y$ refers to the number of elements in the vector $y$. $i = x\,\mathrm{h}$ refers to the index in $y$ at $x$ hours into the vector. So for a sample rate $f_s$, the element $y_{x\,\mathrm{h}}$ refers to element number $i = 3600 x f_s$.

Artefacts

A trait of depression is being less active. As movement may cause artefacts in the recording, measuring the amount of artefacts can be used as a coarse proxy for measuring movement and thus for measuring differences between the strains. Figure 5 illustrates some examples of segments detected as artefacts and figure 8 illustrates that the amount of artefacts varies between the animals. As artefacts are encoded as NaN values by the preprocessing step, this parameter can just check for the occurrence of NaN values in a segment $y$.

[Figure 8: two panels of Artefacts [%] for SD and FSL, with mean µ and median marked: "Artefacts day, p=0.6" and "Artefacts night, p=0.3".]

Figure 8: Amount of data discarded as artefacts during day and night respectively.

$$f(y) = \frac{1}{n_y}\sum_i \operatorname{isnan}(y_i) \tag{10}$$

Chronotype awake

This parameter measures at which time of the day the animals are most likely to be awake. It is computed for the 24 hour baseline as the weighted mean of all awake periods centered around noon during the subjective day.

$$f(y) = \frac{1}{n_y}\sum_{i=6\,\mathrm{h}}^{i=30\,\mathrm{h}} i \cdot \operatorname{isawake}(y_{\operatorname{mod}(i,\,24\,\mathrm{h})}) \tag{11}$$

Chronotype for the day and night subsets does not have the offset:

$$f(y) = \frac{1}{n_y}\sum_{i=0\,\mathrm{h}}^{i=12\,\mathrm{h}} i \cdot \operatorname{isawake}(y_i) \tag{12}$$

This parameter does not have a shortest significant length so the quality measures do not apply.
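A direct reading of the chronotype sum with the 6 hour offset can be sketched as below. The function name, the per-sample list `is_awake` and the half-open summation range (one wrap-around of the vector instead of an inclusive upper limit) are assumptions of this sketch. Setting `offset_hours=0` gives the form without offset.

```python
def chronotype(is_awake, samples_per_hour, offset_hours=6):
    """Sum of i * isawake(y_{i mod n}) over one wrap-around of the vector, divided by n."""
    n = len(is_awake)
    start = offset_hours * samples_per_hour
    return sum(i * is_awake[i % n] for i in range(start, start + n)) / n


# Toy example with one sample per hour: asleep for 12 hours, then awake for 12 hours.
is_awake = [0] * 12 + [1] * 12
print(chronotype(is_awake, samples_per_hour=1))  # (12 + 13 + ... + 23) / 24 = 8.75
```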

Sleep

The sleep parameter measures the total amount of sleep in a segment $y$, measured as a fraction of the segment length.

$$f(y) = \frac{1}{n_y}\sum \operatorname{isasleep}(y_i) \tag{13}$$

Sleep slope

The sleep slope parameter measures how the sleep scoring varies across a subset segment $y$. It performs a linear regression onto the sleep scoring.

$$\begin{aligned}
\{a_j, b_j\} &= \arg\min_{a'_j,\, b'_j} \sum_i \left( a'_j i + b'_j - \text{sleep score}_j(y_i) \right)^2 \\
f(y) &= \{a_{\text{awake}}, b_{\text{awake}}, a_{\text{non-rem}}, b_{\text{non-rem}}, a_{\text{rem}}, b_{\text{rem}}\}
\end{aligned} \tag{14}$$

This parameter only makes sense to compute for long segments, such as a whole day, a whole night, or the entire 24 hour baseline.
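The least squares line fit for one sleep state can be sketched with the closed-form solution for simple linear regression. The function name and the example scores are assumptions for illustration; the same fit would be repeated for each of the three states.

```python
def fit_line(scores):
    """Least squares fit of scores[i] ~ a * i + b; returns (a, b)."""
    n = len(scores)
    mean_i = (n - 1) / 2.0
    mean_s = sum(scores) / n
    sxx = sum((i - mean_i) ** 2 for i in range(n))
    sxy = sum((i - mean_i) * (s - mean_s) for i, s in enumerate(scores))
    a = sxy / sxx          # slope: change in score per sample
    b = mean_s - a * mean_i  # intercept at i = 0
    return a, b


awake_scores = [0.1, 0.3, 0.5, 0.7]
print(fit_line(awake_scores))  # approximately (0.2, 0.1)
```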

Subset segment length

A subset of only awake (or only sleep) consists of many concatenated shorter segments. This parameter is used to study if the typical length of these segments varies between strains. Each element $y_i$ in the vector comes from one such concatenated segment and $s(y_i)$ gives the length of that segment. $f(y)$ gives the average segment length.

$$f(y) = \frac{1}{n_y}\sum s(y_i) \tag{15}$$

Subset length

This parameter is used to study if the length of a subset varies between strains.

$$f(y) = n_y \tag{16}$$

This parameter only makes sense to compute for a 24 hour period.

MAD

The total signal is normalized to have MAD equal to 1 (section 2.4.2) but a segment or a subset may have a different MAD-scale. MAD (median absolute deviation) is a robust measure of scale that is unaffected by outliers. This parameter is used to study if the signal has a different scale in subsets from different strains.

$$f(y) = \operatorname{median}_i \left| y_i - \operatorname{median}_j y_j \right| \tag{17}$$

RMS

The total signal is normalized to have MAD equal to 1 (section 2.4.2), so a segment or a subset may have a different RMS-power. This parameter is used to study if the power of different subsets varies between strains.

$$f(y) = \sqrt{\frac{1}{n_y - 1}\sum y_i^2} \tag{18}$$

Power spectrum

This parameter describes the mean of squared Fourier amplitudes resampled to a logarithmic frequency distribution $g_m$ with 40 frequency bins between 0.4 Hz and 120 Hz. The Fourier amplitudes are calculated from a spectrogram with a Hamming window of length $w = 256$ and $o = 50\%$ overlap.

$$\begin{aligned}
F_k\{x\} &= \frac{1}{n_x}\sum_{j=0}^{n_x-1} x_j \, e^{-i 2\pi \frac{k}{n_x} j} \\
f_k &= k f_s / n_x \\
g_m &= \exp\left(\log(0.4) + \frac{\log(120) - \log(0.4)}{40 - 1}(m - 1)\right) \\
K_m &= \left\{ k : m = \arg\min_{m'} \left|\log f_k - \log g_{m'}\right| \right\} \\
K' &= \{K_m : |K_m| > 0\} \\
G_m\{x\} &= \operatorname{mean}_{k \in K'_m} |F_k\{x\}|^2 \\
H_i(x) &= x_i h_i \\
Y_{m,i}\{y\} &= G_m\{H(y_{iwo\,:\,iwo+w})\} \\
f^1_m(y) &= \operatorname{mean}_i Y_{m,i}\{y\} \\
f^2_m(y) &= \operatorname{std}_i Y_{m,i}\{y\} \\
f(y) &= [f^1\ f^2]
\end{aligned} \tag{19}$$

Where $F_k\{x\}$ is the discrete Fourier transform of $x$, $f_s$ is the sample rate of $x$, $f_k$ are the linearly distributed frequencies of the Fourier spectrum ($f_k$ gives the linear center of each bin), and $g_m$ are the logarithmically redistributed frequencies ($g_m$ gives the logarithmic center of each bin). $G_m$ is the resampled Fourier transform with squared amplitudes. The resampling is done by taking the mean of a nearest mapping where $K_m$ denotes the set of $f_k$ bins that go into a $g_m$ bin. $H$ is a window function with a window vector $h_i$ (omitted). $Y_{m,i}\{y\}$ is the spectrogram of $y$ and $y_{iwo\,:\,iwo+w}$ denotes the subset of the vector $y$ spanning from element $y_{iwo}$ to (but not including) element $y_{iwo+w}$. $f^2$ captures the variability of the power spectrum and is used to take heteroscedasticity into account to see if it can be used as an explanatory variable.

Spectrum dB

This parameter describes the power spectrum on a logarithmic scale. The logarithmic scale is likely to be a better explanatory variable as natural events (such as an EEG) are typically logarithmically distributed, both in power and frequency. The binomial regression (section 2.6.2) applies an affine transformation to the parameter values but that will only be efficient if the parameter values are approximately normally distributed. The logarithm of the power is more normally distributed than the power measured in squared Fourier amplitudes, see figure 7.

$$\begin{aligned}
g_m(y) &= \text{Parameter: power spectrum } f_m(y) \\
f_{\text{spectrum dB}}(y) &= 10 \log_{10} g_m(y)
\end{aligned} \tag{20}$$

Alpha power

Others have reported a variation during clinical depression in the alpha power between the left and right prefrontal cortex. This experiment only used one electrode on the prefrontal cortex so it was not able to detect any shift in alpha power between the left and right side. The next best thing is to examine whether any variation in power shows up at all when comparing the strains.

This parameter captures the mean value of the alpha band in the power spectrogram.

$$\begin{aligned}
g_m(y) &= \text{Parameter: power spectrum } f_m(y) \\
\alpha &= \{m : 8 < g_m < 12\} \\
f(y) &= \sum_{m \in \alpha} g_m(y)
\end{aligned} \tag{21}$$
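The band selection in equation (21) amounts to summing the power of the bins whose centre frequency lies strictly between 8 Hz and 12 Hz. In this sketch the function name, the bin centres and the power values are made-up illustration data.

```python
def alpha_power(bin_centers_hz, power, lo=8.0, hi=12.0):
    """Sum the power of all bins with centre frequency strictly inside (lo, hi)."""
    return sum(p for f, p in zip(bin_centers_hz, power) if lo < f < hi)


centers = [7.45, 9.99, 11.56, 13.4]  # hypothetical bin centres in Hz
power = [1.0, 2.0, 3.0, 4.0]         # hypothetical mean power per bin
print(alpha_power(centers, power))   # 5.0
```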

22

Page 23: Johan Gustafsson - DiVA portal

2.6 Strain classificationEach animal ri has a baseline day

Y 24H “ Y 24Hpriq (22)

A subset Y of the baseline day Y “ SpY 24Hq Ď Y 24H were used to evalute the parameter duringcombinations of non-sleep, REM-sleep and NREM-sleep during subjective day, subjective night and entire24 hour baseline. The most interesting baseline subset is non-sleep during subjective day as it matchesthe project goal, but some of the other subsets of the baseline were also evaluated for reference. Theconstruction of baseline subsets is described in detail in section 2.4.5. The variabels were then evaluatedon one or more segments y Ď Y . A vector-valued parameter f with M elements can be evaluated for anarbitrary segment of EEG data expressed as a vector y as denoted by equation 23.

The mean spectrum of a segment y for instance yields row vector describing the power ofM frequencybands. For scalar parameters M “ 1, such as the average amount of sleep in minutes per hour. Eachparameter can be evaluated on a segment yri from each animal ri yielding matrix X with one row fromeach animal.

$$X_{i,j} = f_j(y_{r_i}) \qquad (23)$$

$X$ is an $N \times M$ matrix for $N$ animals in the dataset and $M$ values in the parameter $f$. $X$ thus depends on which subset is used when evaluating $f$.

2.6.1 Parameter decomposition

If the classifier is built on too many variables $M$ compared to the number of samples $N$ it will overfit. A dimensionality reduction through singular value decomposition (SVD) was therefore performed. Decomposition through SVD is equivalent to PCA (principal component analysis) if the mean is zero (as in a normalized dataset) but is also more numerically stable than forming the covariance matrix. SVD was used initially once on the entire baseline to find the mean $\mu_j$ and standard deviation $\sigma_j$ of each variable and the components $V$ describing most of the variance in the dataset. This had no effect on the result when $X$ or $X_g$ had a single variable ($M = 1$), but it was performed anyway to build all classifiers through the same procedure. To prepare the matrix for decomposition it was first normalized by removing the mean and scaling by the standard deviation of each variable $j$ in the parameter $f$.

$$\mu_j = \frac{1}{N} \sum_i X_{i,j}$$

$$\sigma_j = \sqrt{\frac{1}{N-1} \sum_i (X_{i,j} - \mu_j)^2}$$

$$Z_{i,j} = (X_{i,j} - \mu_j) / \sigma_j \qquad (24)$$

where $Z$ is a normalized representation of $X$ with normalization coefficients estimated from the baseline. The SVD theorem for real matrices states that there exists a factorization of a matrix of the form

$$Z = U \zeta V^T, \qquad (25)$$

where $U$ is an $N \times N$ unitary matrix, $\zeta$ is an $N \times M$ diagonal matrix with non-negative real numbers on the diagonal, and $V^T$ denotes the transpose of the $M \times M$ unitary matrix $V$. Such a factorization is called a singular value decomposition of $Z$. The diagonal elements of $\zeta$ are the singular values of $Z$, or equivalently the square roots of the eigenvalues of $Z Z^T$. Furthermore, the order of the columns and rows is typically chosen such that the diagonal values of $\zeta$ are in decreasing order. The rows of $V^T$ represent a basis for the rows in $Z$ where the basis vectors are ordered by how much of the variance in $Z$ they describe. Since $V$ is unitary, $V^T V = V^{-1} V = I$. By transforming the data to coefficients $C$ expressed in this basis the data can be reduced to fewer dimensions while keeping most of the variance, see equation 26.

$$\begin{aligned}
Z &= U \zeta V^T \\
C &= U \zeta \\
Z &= C V^T \\
Z V &= C V^T V \\
Z V &= C
\end{aligned} \qquad (26)$$

The rows of $V^T$ are reduced to the first $k$ rows, where $k = \min\big(\operatorname{rank}(Z), \lceil N/4 \rceil\big)$, which gives at most 4 predictor variables for the $N = 16$ animals used. It is worth noting that for $M = 1$ the SVD gives $V = 1$ and thus $C = Z$, so there is no need for a special case to disable dimensionality reduction for parameters with only a single variable. Another way of explaining how much of the variance is retained after a dimensionality reduction is to compare it to a linear regression where $C$ holds the explanatory variables and $Z$ the response variables. The coefficient of determination $R^2$ for such a regression, when the mean of $Z$ is a zero vector, can be obtained from the singular values $\zeta_{i,i}$ as given by equation (27).
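The normalization and reduction steps of equations 24 to 26 can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the implementation used in the project: the function name `reduce_parameters`, the random example matrix and the argument `max_ratio` are hypothetical, and the sketch assumes that no variable is constant (otherwise the standard deviation would be zero).

```python
import math

import numpy as np


def reduce_parameters(X, max_ratio=4):
    """Normalize X (N animals x M variables) and keep the first k components.

    k = min(rank(Z), ceil(N / max_ratio)), as described in the text.
    """
    N, M = X.shape
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)        # sample standard deviation per variable
    Z = (X - mu) / sigma                 # equation 24
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)  # equation 25
    k = min(int(np.linalg.matrix_rank(Z)), math.ceil(N / max_ratio))
    V_k = Vt[:k].T                       # first k basis vectors, shape M x k
    C = Z @ V_k                          # reduced coefficients, shape N x k
    return mu, sigma, V_k, C, s


# Hypothetical data: 16 animals, 10 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 10))
mu, sigma, V_k, C, s = reduce_parameters(X)
```

With 16 animals this keeps at most 4 columns in `C`, which are the columns later matched against the strain labels. Note that the sign of each singular vector is arbitrary, so for a single variable the sketch may return `C = -Z` rather than `C = Z`.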

$$R^2 = \frac{\sum_{i=1}^{k} \zeta_{i,i}}{\sum_i \zeta_{i,i}} \qquad (27)$$

This shows why it makes sense to talk about how much of the variance is retained in percent: it equals how much of the singular values is retained.
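A small sketch of the retained fraction in equation 27 follows. The function name `retained_fraction` and the example singular values are hypothetical. The default `power=1` follows the equation as written; the optional `power=2` is an addition that gives the fraction of the squared Frobenius norm of $Z$, which is the quantity usually called explained variance in PCA.

```python
def retained_fraction(singular_values, k, power=1):
    """Fraction of the (powered) singular values kept by the first k components."""
    kept = sum(s ** power for s in singular_values[:k])
    total = sum(s ** power for s in singular_values)
    return kept / total


# Hypothetical singular values in decreasing order.
singular_values = [4.0, 2.0, 1.0, 1.0]
fraction_as_written = retained_fraction(singular_values, k=2)
fraction_squared = retained_fraction(singular_values, k=2, power=2)
```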

2.6.2 Strain classifier

Each data segment $y$ comes from a single animal $r_i$ that belongs to one of two strains. The strain is given by $S$ in equation 28.

$$S(y_{r_i}) \in \{\text{sd}, \text{fsl}\} \qquad (28)$$

The estimated strain $\hat{S}$ in equation 29 estimates the strain of an arbitrary data segment $y$ for which the strain is unknown. It uses a strain classifier that yields a probability $\Pr$ for each strain. The strain classifier is built using data in a baseline dataset.

$$\hat{S}(y) = \max_{s \in \{\text{sd}, \text{fsl}\}} \Pr\big(S(y) = s\big) \qquad (29)$$

Each animal $r_i$ and corresponding row of the matrix $C$ belongs to one of two strains described by the corresponding element in $\{S_i\}$. The matrix $C$ is matched against the strain vector $S$ to build classifiers for the strains by using binomial regression. $C_i$ is a row with all observed variables for the animal $r_i$. The $\beta$ parameter from the binomial regression is used to compute a probability $\Pr$ for a given data row $c$ to belong to one of the two strains, $S = \text{sd}$ or $S = \text{fsl}$.

The classifier for a set of variables $g$ is then described by $\{X_0, V, \beta\}$. The probability that a segment of a signal $y$ belongs to a given strain is given by computing $c$ from the signal $y$ and using the probability from the binomial regression (5). The predicted strain is the strain with the highest probability in (30).


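$$\begin{aligned}
x_j &= F_j(y) \\
x(y) &= \{x_{g_1}, x_{g_2}, \cdots\} \\
c(y) &= \big(x(y) - X_0\big) V^T \\
\Pr\big(S(y) = \text{sd}\big) &= \Pr\big(S = \text{sd} \mid c(y)\big) \\
\hat{S}(y) &= \max_{s \in \{\text{sd}, \text{fsl}\}} \Pr\big(S(y) = s\big)
\end{aligned} \qquad (30)$$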
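Applying a fitted classifier to one segment, as in equation 30, can be sketched as follows. The sketch assumes that the binomial regression uses a logistic link and that the basis vectors are supplied as rows; the names `classify_segment`, `x0`, `basis` and `beta`, and all numbers, are hypothetical.

```python
import math


def classify_segment(x, x0, basis, beta):
    """Return (Pr(sd), predicted strain) for one parameter vector x.

    x0    : centring vector (stands in for X_0)
    basis : list of k basis vectors, each of length M
    beta  : [intercept, b_1, ..., b_k], assumed to come from a logistic regression
    """
    centred = [xi - mi for xi, mi in zip(x, x0)]
    # Project the centred vector onto each basis vector to obtain c(y).
    c = [sum(v * z for v, z in zip(row, centred)) for row in basis]
    eta = beta[0] + sum(b * ci for b, ci in zip(beta[1:], c))
    p_sd = 1.0 / (1.0 + math.exp(-eta))
    strain = "sd" if p_sd >= 0.5 else "fsl"
    return p_sd, strain


# Hypothetical two-variable parameter reduced to one component.
p_sd, strain = classify_segment(
    x=[2.0, 1.0], x0=[1.0, 1.0], basis=[[1.0, 0.0]], beta=[0.0, 2.0]
)
```

With only two strains, picking the strain with the highest probability is the same as thresholding $\Pr(\text{sd})$ at 0.5.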

2.6.3 Classifier quality

The quality of most parameters is also measured by how long the segments must be to obtain a significant difference in the probability $\Pr(S = \text{sd})$ between the strains with a p-value $< 0.05$. Some parameters require a fixed-length segment and are excluded from this evaluation. The p-value is estimated for a given parameter and segment length through a Monte Carlo algorithm that selects a random segment with the given segment length from each animal and computes a p-value describing how well separated sd and fsl are. This is then repeated for a number of different segments to compute an average p-value for that segment length. Cohen's kappa $\kappa$ (see equation 7) is similarly estimated by taking a number of different segments $y$ from all animals and comparing the result of the strain classifier $\hat{S}$ with the true strains $S$ as in equation 31.

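$$\begin{aligned}
\hat{S} &= \hat{S}(y) \\
\kappa(\hat{S}, S) &= \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \\
\Pr(a) &= \Pr(\hat{S} = S) \\
\Pr(e) &= \Pr(\hat{S} = \text{sd}) \cdot \Pr(S = \text{sd}) + \Pr(\hat{S} = \text{fsl}) \cdot \Pr(S = \text{fsl})
\end{aligned} \qquad (31)$$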

For vector-valued parameters (such as a spectrum that has one value for each frequency bin) the value used in the p-test is the strain score from the binomial classifier.

$Q_\kappa(f \mid \kappa \geq \gamma)$ is a quality measure telling how long the segments must be to obtain a correct identification of the strains with $\kappa \geq \gamma$. A small value of $Q_\kappa$ means that a short segment is enough to achieve a correct classification most of the time. A higher $\gamma$ means that more segments were correctly classified. $\gamma = 1$ means that every single segment was correctly classified.

In equation 31, $\Pr(a)$ is the relative observed agreement between classifier and true value, $\Pr(e)$ is the hypothetical probability of chance agreement, $\hat{S} = \hat{S}(y)$ is the estimated strain of the segment $y$ and $S = S(y)$ is the true strain.
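Cohen's kappa for a set of classified segments, as in equation 31, can be computed with a short function. This is a minimal sketch; the name `cohens_kappa` and the example labels are hypothetical, and the function is undefined when the chance agreement equals 1.

```python
def cohens_kappa(predicted, true, labels=("sd", "fsl")):
    """Cohen's kappa between predicted and true strain labels."""
    n = len(true)
    # Observed agreement Pr(a).
    pr_a = sum(p == t for p, t in zip(predicted, true)) / n
    # Chance agreement Pr(e) from the marginal label frequencies.
    pr_e = sum((predicted.count(l) / n) * (true.count(l) / n) for l in labels)
    return (pr_a - pr_e) / (1.0 - pr_e)


# Hypothetical result: 8 segments, one sd segment misclassified as fsl.
true = ["sd"] * 4 + ["fsl"] * 4
pred = ["sd", "sd", "sd", "fsl", "fsl", "fsl", "fsl", "fsl"]
kappa = cohens_kappa(pred, true)
```

In this example 7 of 8 segments agree, the chance agreement is 0.5, and kappa is therefore 0.75.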

Before testing a large number of segments the variable groups are first verified with a segment length that equals the entire baseline. If the variable works as a predictor (i.e. $p < 0.05$ or $\kappa > 0.75$ respectively) the segment length is reduced by a factor of 2 and the number of segments to test is set to $\lceil 4 \sqrt{2^n} \rceil$, where $n$ is the number of reduction steps.
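The halving procedure can be sketched as a schedule of segment lengths and segment counts. The sketch reads the segment count as $\lceil 4 \sqrt{2^n} \rceil$, which is an assumption about the intended expression, and it omits the check that stops the procedure when the variable no longer works as a predictor. The function name `segment_schedule` and the minimum length are hypothetical.

```python
import math


def segment_schedule(baseline_seconds, min_seconds):
    """List (n, segment length in seconds, number of segments) per halving step.

    Step n = 0 is the single segment covering the entire baseline.
    For n >= 1 the count is assumed to be ceil(4 * sqrt(2 ** n)).
    """
    schedule = []
    n = 0
    length = float(baseline_seconds)
    while length >= min_seconds:
        count = 1 if n == 0 else math.ceil(4 * math.sqrt(2 ** n))
        schedule.append((n, length, count))
        n += 1
        length /= 2
    return schedule


# A 24 hour baseline, halved until segments would be shorter than 5 hours.
schedule = segment_schedule(24 * 3600, 5 * 3600)
```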

2.7 Ketamine effects

The results of the method described above are a few different strain classifiers, one for each parameter. When using any of the strain classifiers on new data the binomial coefficients yield a probability of belonging to one group or the other. Changes in this probability after Ketamine treatment then describe a treatment effect with respect to getting closer to or further away from either model. If the ketamine administration yields a significantly higher probability of belonging to the non-depressed model it is an indication of a possibly anti-depressive treatment effect. The significance was evaluated through regression with interaction effects where the coefficients were deemed significant if they were non-zero with $p < 0.05$.

The probability of belonging to the non-depressed model is evaluated for different subsets by doing a regression on the strain probabilities before and after treatment and testing for significance of the treatment effect. The corresponding regression coefficient gives the treatment effect with respect to that parameter. Two subsets of each day were used for this purpose: awake during the subjective day, and the subjective night. Day 2 was regarded as the baseline against which significant effects were compared.

The experiment setup was factorial, meaning that each combination of strain and ketamine administration level is present in the dataset. The subset is added as an additional factor in the regression.

The data is first checked to see if the effects of ketamine are so short that any effect of the first ketamine administration seems to be gone before the second administration on day 6. If the effects are not gone, data after that administration is discarded. As only a few treatment combinations occur more than once (if they occur at all) the dataset is too small to draw conclusions about interaction effects from a mixture of treatments. The studied dataset from day 3 then consists of merely 2-3 animals per strain (SD or FSL) and ketamine level (0, 10 or 30).

Administrations at day 3 and day 6 were not considered equivalent as there was a significant difference between day number 2, day number 5 and day number 8. That could otherwise have increased the dataset to 4-6 animals per strain and ketamine level.

If the non-depressed model is not affected by the treatment and the depressed model seems to show more non-depressed values it is an indication of a possibly anti-depressive treatment effect. The effects are summarized by a linear regression over variables with a significant correlation to the parameter estimate. But as the dataset was very small it is hard to talk about meaningful significance with merely 2-3 animals per strain and ketamine level.

3 Results

3.1 Notation of results

To numerically summarize the difference in mean values between SD and FSL a linear model is used where a value is expressed on the form $\mu + \Delta F$. The notation $1 + 2F$ means that the intercept has a value of 1 and that the effect of the parameter $F$ (for FSL) is 2. That is, in this case the mean of the SD group is 1 and the mean of the FSL group is $3 = 1 + 2$.

Uncertainty in a measure is given by the standard deviation in parentheses, which applies to the least significant figures of the number prior to the parentheses. For instance $x = 0.123(45)$ means that the mean value of $x$ was estimated to 0.123 and that the sample standard deviation of the measured values was estimated to 0.045.
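The parenthesis notation can be produced mechanically, as in the following sketch. The helper name `format_with_uncertainty` is hypothetical, and the number of decimals is chosen by the caller rather than derived from the data.

```python
def format_with_uncertainty(mean, std, decimals):
    """Format a mean and standard deviation in the compact form 0.123(45).

    The standard deviation is expressed in units of the last `decimals` digits.
    """
    digits = round(std * 10 ** decimals)
    return f"{mean:.{decimals}f}({digits})"


example = format_with_uncertainty(0.123, 0.045, 3)
```

Here `example` holds the string `0.123(45)`, matching the example in the text.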

3.2 Decomposition

Table 2: ICA components were created by retaining 90% of the eigenvalues and reducing to independent components. The components were created by decomposing the loglognormal spectrogram. The first table lists how much of the variance in the loglognormal spectrogram is retained by the decomposition, measured by $R^2$. The second table lists how much of the variance was retained in the original spectrogram after resampling back to the linear scale.

R^2 for loglognormal spectrogram
#    PCA    ICA    NMF
1    22%    18%    17%
2    32%    25%    39%
3    39%    31%    46%
4    45%    39%    51%
5    49%    46%    56%
6    54%    54%    61%
7    58%    54%    64%

R^2 for power spectrogram
#    PCA    ICA    NMF
1    78%    78%    11%
2    79%    79%    17%
3    80%    79%    19%
4    81%    80%    21%
5    81%    81%    21%
6    82%    82%    21%
7    82%    82%    21%

PCA assumes that the data is normally distributed and creates components based on how different variables covary. ICA, however, assumes that the data is not normally distributed and creates components that separate independent clusters. NMF creates components that describe the data without using negative numbers, which makes sense for describing data that cannot be negative. The spectrogram


[Figure 9 panels: scatter plots of PCA 1 versus PCA 2 and PCA 3 versus PCA 4, and of ICA 1 versus ICA 2 and ICA 3 versus ICA 4, each shown for the distribution itself and for 10·|SD - FSL|.]

Figure 9: These figures show the distribution of the spectrogram values as expressed by the first two components in ICA and PCA respectively. The separation in the first component represents the separation between high gamma and high theta activity. This also shows that the number of ICA components was too high compared to the allowed dimensionality limited by PCA components (6 PCA components were used to reduce the 4 ICA components). But the sleep scoring worked well with this setup.

Table 3: Named frequency bands. This report uses the definitions provided by Buzsaki. All frequency bands here are defined by an interval given in hertz.

Name Buzsaki [10] Dauwels [25] Vakalopoulos [29]
delta 1.5-4 0.5-4
theta 4-10 4-8 4-8
alpha 8-10, 10-12 8-12
beta 10-30 12-30 12-30
gamma 30-80 30-100

power cannot be negative when measured in Fourier amplitude squared, and others have shown that NMF is useful for describing such components. NMF was the best decomposition for sleep scoring but it was too slow to solve the constrained optimization problem of finding non-negative coefficients of the components for every single timestep in the spectrogram. PCA was the next best, but an abused ICA was used instead since ICA was thought to be better. The normalization used for NMF skipped removing the mean to keep the zero level, but still normalized the power and rescaled the frequency axis to a logarithmic axis. Section 2.6.1 describes PCA.

3.3 Sleep scoring

Table 4: Spectrogram based sleep scoring results.

Sleep state          Amount in         Correct in          Correct in
                     manual scoring    training dataset    validation dataset
Awake                47%               91%                 90%
Non-REM sleep        39%               96%                 96%
REM sleep            14%               64%                 68%
Cohen's κ (eq. 7)    -                 83%                 82%


[Figure 10 panels: relative power versus frequency (Hz, logarithmic axis from 10 to 100) for "4 PCA components", "4 ICA components", "4 NNMF components", and for NNMF with 1, 2, 3, 5, 6 and 7 components.]

Figure 10: Principal component analysis (PCA), independent component analysis (ICA) and non-negative matrix factorization (NMF) with 4 components of a normalized baseline spectrogram from all animals. 4 components retain 98.9% of the eigenvalues and 7 components retain 99.8%.


[Figure 11 panels: "Sleep state deviation from mean" (dB versus Hz on a logarithmic axis; Awake, Non-REM, REM) and "EEG power in different strains and states" (dB versus Hz on a logarithmic axis; SD awake, SD Non-REM, SD REM, FSL awake, FSL Non-REM, FSL REM).]

Figure 11: Deviations from mean during different sleep states in the baseline recordings. The left plot shows the mean of all baseline recordings in each strain and state. The right plot shows all individual spectra from each baseline recording to give a sense of the spread in the dataset. The right plot also illustrates that the power normalization that rescaled all frequencies equally did not introduce corresponding gaps in the frequency spectra between recordings. The average spectrum is shown with the standard deviation from 30 second segments.

[Figure 12 panels: "R2. 91% correct" and "R12. 88% correct". Each panel shows the manual and automatic sleep scoring (Awake, Non-REM, REM) and the sleep score components (0% to 100%) over time in hours, from 0 through the night to 12 and through the day to 24.]

Figure 12: Sleep scoring. The top row in gray illustrates the manual sleep scoring, which assigns one of 3 states. The next black row illustrates the results of the spectrogram based sleep scoring. The coloured lines illustrate the multinomial state scores, for which the black row denotes the state with the highest score (after thresholding). The left plot shows one of the signals used as data for training the multinomial regression for sleep scoring. The right plot shows how the sleep scoring was subsequently applied to one of the signals in the validation dataset.


Figure 11 shows how the different sleep states vary in spectral power distribution and table 4 lists the quality of the resulting sleep scoring built on that. The smoothing window length was set to 60 timesteps in the spectrogram, or 15 seconds. This should be compared to the manual scoring resolution of 10 second epochs. The multinomial scoring thresholds were set to 0.6, 0.55 and 0.45 for awake, non-REM and REM respectively. Sleep scoring was also performed without decomposing the dataset, where Cohen's κ was 95% for the training dataset and 79% for the validation dataset. This means that it was overfitted compared to the sleep scoring based on the decomposition shown in table 4, where Cohen's κ was approximately the same in the training dataset and the validation dataset (83% vs. 82%). Figure 12 illustrates the process of sleep scoring and serves to give an idea of how the different sleep states were distributed throughout the day.
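The smoothing and thresholding step can be sketched as below. The exact thresholding rule is not stated in this section, so the rule used here (keep the highest-scoring state among those that reach their own threshold, otherwise leave the timestep unscored) is an assumption, as are the names `moving_average`, `score_state` and `THRESHOLDS`.

```python
def moving_average(values, window):
    """Centred moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Thresholds quoted in the text for awake, non-REM and REM.
THRESHOLDS = {"awake": 0.6, "non-REM": 0.55, "REM": 0.45}


def score_state(scores, thresholds=THRESHOLDS):
    """Return the highest-scoring state that reaches its threshold, else None."""
    passing = {state: v for state, v in scores.items() if v >= thresholds[state]}
    if not passing:
        return None
    return max(passing, key=passing.get)


smoothed = moving_average([0.0, 0.0, 3.0, 0.0, 0.0], window=3)
state = score_state({"awake": 0.3, "non-REM": 0.2, "REM": 0.5})
```

The lower threshold for REM makes it easier to assign the least common state, which would otherwise rarely reach the highest score.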

3.4 EEG parameters

Table 5: p-values and parameter estimates or classifier scores for the EEG parameters listed in section 2.5. A dash (-) indicates that the parameter is not applicable to that subset. p-values with $p < 0.05$ are significant. A value of $p = 0.000$ is significant to 3 digits after the decimal point, meaning $p < 0.0005$.
¹ For vector-valued parameters the value in the linear model represents the classifier score from the multinomial regression.
² The parameter is evaluated on the EEG signal prior to preprocessing.

                    Baseline              Night                 Day                   Day awake
Parameter           value          p      value          p      value          p      value          p
Artefacts [%]       2.3 - 0.6F     0.49   1.1 - 0.5F     0.3    3.5 - 0.7F     0.62   4.4 - 0.6F     0.78
Chronotype [h]      17.9 + 0.9F    0.01   6.3 - 1.0F     0.01   18.1 + 0.6F    0.002  -
Sleep [%]           50 + 3F        0.02   71 - 4F        0.2    29 + 11F       0.000  -
[R^2]               0.3 + 0.2F            0.9 + 0.0F            0.9 - 0.4F            -
Sleep score slope¹  0.3 + 0.4F     0.03   0.25 + 0.5F    0.003  0.23 + 0.5F    0.001  0.37 + 0.3F    0.027
MAD² [µV]           28 + 6.3F      0.01   29 + 7.2F      0.01   26 + 5.7F      0.01   26 + 5.7F      0.01
RMS² [µV]           56 + 19F       0.01   56 + 18F       0.006  56 + 19F       0.08   56 + 19F       0.08
MAD [~V]            1.0 + 0.0F     0.92   1.1 + 0.0F     0.36   0.93 - 0.0F    0.54   0.83 - 0.06F   0.000
RMS [~V]            1.7 + 0.2F     0.000  1.9 + 0.2F     0.000  1.6 + 0.1F     0.006  1.3 - 0.1F     0.000
Spectrum¹           0.06 + 0.9F    0.000  0.0 + 1F       0.000  0.1 + 0.8F     0.000  0.1 + 0.8F     0.000
Spectrum dB¹        0.01 + 0.99F   0.000  0.05 + 0.89F   0.000  0.05 + 0.90F   0.000  0.05 + 0.77F   0.000
Alpha power¹        0.48 + 0.04F   0.4    0.50 + 0.0F    0.84   0.46 + 0.08F   0.3    0.4 + 0.2F     0.1

Table 5 lists the parameter estimates of the EEG parameters listed in section 2.5. There were somewhat fewer artefacts in FSL than in SD on average, but the difference was not significant when taking into account the sample variance within each strain (the p-value over the entire baseline was 0.49).

The chronotype parameter shows a significant difference where FSL had a daily rhythm shifted by about 1 hour compared to SD. FSL was more awake in the beginning of the night (mean 5.3 hours into the night) whereas the short periods of wakefulness in SD were equally spread out during the night with a mean of 6.3 hours into the night. During the day the periods of wakefulness were equally spread out in SD with a mean around 18.1 hours from the day start (where 18 hours is noon), but the mean of FSL was later, around 18.7 hours, with a significance of $p = 0.002$. The baseline was evaluated for the chronotype parameter by counting the first 6 hours as the last 6 hours of the next day, that is, the chronotype was evaluated with a signal centered around noon.

Both SD and FSL slept in total about 50% of the 24 hour baseline (sleep [%] and [R2]). During the night they slept about 70% of the time and during the day SD slept about 30(6)% of the day whereas FSL slept about 40(4)% of the day. There was however a significant difference in the stability of the sleep patterns during the day, which showed on the entire baseline as well. During the day SD had a stable pattern of mostly being awake, interrupted by several short sleep events, so the cumulative amount of sleep during the day increased fairly linearly with $R^2 = 0.9$. FSL on the other hand not only slept more, but several sleep segments were long, so the cumulative amount of sleep was not well explained by a simple linear fit ($R^2 = 0.5$).
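The idea behind the $R^2$ of cumulative sleep can be illustrated with a small sketch: sleep that is evenly spread out gives a cumulative curve close to a straight line, whereas one long block of sleep does not. The function name `cumulative_sleep_r2` and the toy sequences (1 for an epoch asleep, 0 for awake) are hypothetical, and the sketch assumes at least two epochs with some sleep.

```python
from itertools import accumulate


def cumulative_sleep_r2(asleep):
    """R^2 of a least-squares line fitted to the cumulative amount of sleep."""
    y = list(accumulate(asleep))
    n = len(y)
    x = range(n)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Same total sleep, different patterns.
r2_spread = cumulative_sleep_r2([1, 0, 1, 0, 1, 0, 1, 0])  # short, regular naps
r2_block = cumulative_sleep_r2([0, 0, 0, 0, 1, 1, 1, 1])   # one long sleep block
```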

The sleep score slope parameter describes that the activity causing sleep scoring varies between the strains in all subsets. During the night REM sleep was scored on average as 0.41(7) in both SD and FSL, but a linear fit showed an increase between the beginning and end of the night of 0.146(28)/h in SD and 0.092(52)/h in FSL. Non-REM sleep was similarly scored 0.41(4) in both SD and FSL, but a linear fit showed a decrease between the beginning and end of the night of 0.131(66)/h in SD and 0.011(70)/h in FSL. During the day non-REM sleep was scored equally high in the beginning of the day as in the end, 0.27(2), for SD, whereas for FSL the linear fit showed that the end of the day was scored 0.08 lower than the beginning of the day, with the same mean as SD. This means that FSL slept more in the beginning of the day than in the end of the day whereas SD slept equally much in the beginning of the day as in the end of the day.

The signal strength measured by both MAD and RMS before normalization (MAD², RMS²) was significantly stronger in FSL compared to SD during both night and day.

After normalization the signal strength was equal during both night and day measured in MAD [~V], but in RMS [~V] there was still a significant difference in strength. This means that the signals from the two strains come from two separate distributions (as the ratio between RMS and MAD is constant for any given distribution). This could for instance be explained by a difference in the amount of time spent in the different sleep states.
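The argument that the RMS to MAD ratio characterizes the shape of a distribution, independent of gain, can be checked with a short sketch. It assumes that MAD denotes the median absolute deviation from the median; the function names and the example samples are hypothetical.

```python
import math
import statistics


def rms(x):
    """Root mean square of the samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mad(x):
    """Median absolute deviation from the median (assumed definition of MAD)."""
    med = statistics.median(x)
    return statistics.median(abs(v - med) for v in x)


def shape_ratio(x):
    """RMS divided by MAD; unchanged when the signal is multiplied by a gain."""
    return rms(x) / mad(x)


signal = [-3.0, -1.0, 0.0, 1.0, 2.0, 5.0]
scaled = [4.0 * v for v in signal]  # same shape, four times the amplitude
ratio_signal = shape_ratio(signal)
ratio_scaled = shape_ratio(scaled)
```

Both ratios agree, so two recordings that are equal in MAD but differ in RMS cannot differ by a gain factor alone.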

The Spectrum¹ and Spectrum dB¹ parameters are a more fine-grained comparison of signal power. Since the overall power (RMS) was significantly different between the strains it is expected that the spectra in total are also significantly different even though a given single frequency band (such as alpha power) is not. No significant difference in alpha power was found. This can be compared to figure 11 where all recordings have approximately the same power around 10 Hz for a given sleep state, as opposed to the delta band (1.5-4 Hz) in which all FSL recordings are stronger than all SD recordings.

3.5 Strain classifiers

Table 6: Quality measures of variables where s, m, h stand for seconds, minutes and hours respectively. N, D and DA represent different subsets of the baseline: night, day and day awake respectively. Missing values represent that the quality condition was not fulfilled even for the entire subset. Bold measures are shorter than 10 minutes.

Column groups (each with subsets N, D, DA): 100% correct, $Q_\kappa(f \mid \kappa = 1.0)$; ≈ 90% correct, $Q_\kappa(f \mid \kappa \geq 0.9)$; significant, $Q_p(f \mid p < 0.05)$.

Parameter N D DA N D DA N D DA
MAD 30 m
RMS 12 h 12 h 4 h 2 h 12 h 6 m
Spectrum 5 h 5 h 2 h 3 h 2 h 8 m 8 m 2 m
Spectrum dB 5 h 2 h 1 h 12 m 42 m 6 m 60 s 60 s 30 s

The quality measures $Q_\kappa$ and $Q_p$ defined in section 2.6.3 measure how long the segments must be to obtain a separation between the strains. Table 6 lists their values for the EEG parameters where a significant difference clear enough to completely separate the strains was found for at least some subset. See table 1 for a description of the subsets over which the quality measures were evaluated.

The quality level $Q_\kappa(f \mid \kappa = 1.0)$ represents disjoint parameter distributions in the available data. The origin of each recording could thus be classified to the correct strain. The quality level $Q_\kappa(f \mid \kappa = 0.9)$ represents that Cohen's kappa of the strain classification is 0.9, which can be interpreted as 90% correct classification after accounting for correct classification by chance.

A significant difference clear enough to completely separate the strains was found for some subset in all remaining parameters: MAD, RMS, Spectrum and Spectrum dB. The Spectrum parameter produced a strain probability whose average p-value was significant with mean $p < 0.05$ for random segments only a couple of minutes long during the Day awake subset (DA). It is worth noting that the strains were different enough during Day awake that this difference became apparent by just looking at the RMS power of the signal. The MAD parameter is also a way of measuring the power of the signal. MAD was used for normalizing the recording but it was still a significant parameter for separating the strains in the Day awake subset.


Figure 13

Figure 11 shows that the SD strain has more power in gamma (30 – 60 Hz) than FSL during awake, whereas the FSL strain has more power than SD in delta (1.5 – 4 Hz) during sleep. The right figures show how the spectral content varies between the different sleep states. Individual variances are larger than the variations between the groups of each strain. All data are medians from baseline recordings.

The SD strain has stronger gamma (30 – 80 Hz) than the FSL strain. Each 10 second epoch of the signal was assigned to one of four states: awake, rapid eye movement (REM), non-REM (NREM) or undefined/artefact. This sleep scoring was performed manually on the waveform trace of EEG and EMG data by Peter Stienen. Figure 11 shows how the power spectrum varies between the strains as well as how it varies between sleep states within each strain. The EEG is highly non-stationary; activity rather tends to happen in bursts within a limited frequency band. What figure 11 illustrates is that, for instance, a segment of the data during awake will contain stronger and/or more bursts of gamma activity than a segment of sleep.

3.6 Ketamine effects

The treatment effects of Ketamine with respect to significantly increased similarity to SD are listed in table 7. Note that more effects of Ketamine show up two days after administration compared to 8-10 hours afterwards.

Night has a different score compared to day ($p = 0.0001$) so it makes sense to study an interaction model on subset. No significant difference in score was found during the day awake subset. However, a significant difference was found during night (and day).

Table 7 shows linear models of the binomial classification score based on the Spectrum dB parameter for the same night as administration and two nights later. The linear model was built by testing for correlation with ketamine administration levels, strain class and their interaction effects. This setup is bound to find some correlation, as the total number of effects tested for is high and the number of observations is low, so these correlations should be interpreted with care. The intercept is SD during the baseline night. $F$ refers to a significant correlation between the classification score and the FSL strain. $K_i$ refers to a significant correlation between the classification score and ketamine administration.

Table 7: Linear regression of strain score. $R^2$ (3) measures how well the strain score is explained by the linear score model while taking potential overfitting into account. Ketamine was given in the beginning of subjective day 3, 2 hours after the lights were turned on (when the rats are supposed to go to sleep).

Parameter      Subset     Linear score model                          R^2
Spectrum dB    Night 3    ~ 0.96F                                     0.97
Spectrum dB    Night 5    ~ (0.42K0 + 0.51K10 + 0.33K30)S + 1.00F     0.73

A correlation with all ketamine levels means that in the SD strain the spectrum was significantly different between the baseline night (night number 2) and night number 5.

Figure 14 shows how the mean spectra vary after Ketamine administration during night and day separately. The samples were too dispersed to do any reliable statistical analysis, so the plots in figure 14 instead show the mean of each individual. Figure 15 shows the mean of sleep during night, and awake during day, respectively. The sleep scoring was performed by training on the baseline. As the average spectrum seems to vary between the baseline and subsequent days it is possible that this made the sleep scoring less accurate. A likely cause of the variation in average spectrum is however a variation in sleep patterns.


[Figure 14 panels: spectrum deviation in dB (-1 to 1) versus frequency (Hz, logarithmic axis from 10 to 100) for SD and FSL, with one row for night and one for day, columns for 0-24 h, 24-48 h and 48-72 h, and one panel group each for Ketamine = 0 mg/kg, 10 mg/kg and 30 mg/kg.]

Figure 14: Spectrum variations of different animals after Ketamine administration compared to the night and day subsets of the 24h baseline respectively.

[Figure 15 panels: spectrum deviation in dB (-1 to 1) versus frequency (Hz, logarithmic axis from 10 to 100) for SD and FSL, with one row for night asleep and one for day awake, columns for 0-24 h, 24-48 h and 48-72 h, and one panel group each for Ketamine = 0 mg/kg, 10 mg/kg and 30 mg/kg.]

Figure 15: Spectrum variations of different animals after Ketamine administration compared to the non-awake night and awake day subsets of the 24h baseline respectively.

4 Discussion

The results mean that the EEG signals from FSL and SD differ in several aspects, defined as parameter estimates obtained through signal processing. It is possible that some of these parameters are translational biomarkers for clinical depression. As EEG is non-invasive and EEG equipment is readily available in clinical practice it is possible to search for such biomarkers in humans as well.

4.1 Freq

When working with the dataset it was of great help to use Freq for visualizing the signal. Some patterns became clear that would not have been apparent otherwise. It was also natural to use the visualization for navigation in the data when studying and discussing events throughout the data. The memory usage patterns were investigated more thoroughly and significantly improved. The development of the iPad version of Freq provided a great stress-test environment where several issues were easier to pinpoint due to the more restrictive graphics and processing environment. The desktop implementation of OpenGL was very permissive compared to the tablet implementation, making some issues with multithreaded processing hard to locate on the desktop but very apparent on the tablet. The desktop version of OpenGL also permitted the use of the legacy OpenGL application programming interface (API), which can have comparatively poor performance on modern graphics cards. The modern OpenGL API is harder to abuse, which forces the application to be fairly efficient. The development of the tablet version made sure the entire code base was updated to not use any of the deprecated functions. When it came to large datasets, the desktop version of Freq would simply start paging when using the memory inefficiently, whereas the tablet operating system would simply kill the application. This forced a redesign of memory allocation in Freq to avoid paging when working with datasets of a size approaching, or even surpassing, the amount of available random access memory (RAM). This redesign also had the side effect of keeping better track of, and reusing, memory allocations, which significantly reduced cache misses and thus boosted the performance as well.
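The idea of keeping track of and reusing memory allocations can be illustrated with a small buffer pool. This is only a sketch of the general technique and not Freq's actual implementation; the class name `BufferPool` and its methods are hypothetical.

```python
from collections import defaultdict


class BufferPool:
    """Hands out byte buffers and reuses released buffers of the same size."""

    def __init__(self):
        self._free = defaultdict(list)  # size in bytes -> released buffers
        self.allocations = 0            # number of fresh allocations made

    def acquire(self, size):
        """Return a buffer of `size` bytes, reusing a released one if possible."""
        free = self._free[size]
        if free:
            return free.pop()
        self.allocations += 1
        return bytearray(size)

    def release(self, buffer):
        """Give a buffer back to the pool so a later acquire can reuse it."""
        self._free[len(buffer)].append(buffer)
```

A block of time-frequency data that is no longer visible would be released to such a pool instead of freed, so the next block of the same size does not trigger a new allocation.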

4.2 Preprocessing

The variation in signal power between the weakest and strongest recording was about 10 dB (figure 6). One source of this may be variations in the recording setup with varying conductivity. Such a variation should have been random, and there is a low probability that it correlated with the strains by chance (p = 0.01 for MAD2 and RMS2 in table 5). FSL rats are somewhat smaller than SD rats. This can have the effect that EEG events occur closer to the electrode and are thus not attenuated as much, which would explain why the signal power from FSL is stronger.
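To give a feeling for what a 10 dB difference in signal power means, the following minimal sketch compares two amplitude estimates on a decibel scale. It uses the plain root mean square (RMS) and the median absolute deviation (MAD) as stand-ins; the exact definitions of MAD2 and RMS2 in table 5 are not repeated here, and the function names are illustrative assumptions.

```python
import math
import statistics


def rms(samples):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mad(samples):
    """Median absolute deviation around the median."""
    med = statistics.median(samples)
    return statistics.median(abs(x - med) for x in samples)


def level_difference_db(amplitude_a, amplitude_b):
    """Level of amplitude_a relative to amplitude_b in dB (20*log10 for amplitudes)."""
    return 20.0 * math.log10(amplitude_a / amplitude_b)


# Synthetic example: scaling the amplitude by sqrt(10) scales the power by 10,
# which corresponds to a level difference of 10 dB.
weak = [math.sin(0.1 * n) for n in range(1000)]
strong = [math.sqrt(10.0) * x for x in weak]
difference = level_difference_db(rms(strong), rms(weak))
```

A 10 dB spread between recordings thus corresponds to roughly a factor of 3.2 in amplitude, regardless of whether RMS or MAD is used as the amplitude estimate.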

4.3 Decomposition

Figure 10 validates several of the frequency bands listed in table 3. Activity in such bands does indeed occur in the EEG signal. But the activity is not strictly confined to those bands, a natural effect of the signal not being a pure sinusoidal wave to start with. This means that a bandpass filter that tries to capture the events within a given frequency band should not have a sharp cut-off, and that one should be wary of leakage from other frequency bands.
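The remark about cut-offs can be made concrete by comparing a sharp (brick-wall) band weighting with a soft one applied to a power spectrum. The sketch below is an illustration under assumptions: the Gaussian roll-off in log-frequency, the `rolloff_octaves` parameter and the example spectrum are hypothetical and are not the filters used in this study.

```python
import math


def brickwall_weight(freq_hz, low_hz, high_hz):
    """Weight 1 inside the band and 0 outside (sharp cut-off)."""
    return 1.0 if low_hz <= freq_hz <= high_hz else 0.0


def soft_weight(freq_hz, low_hz, high_hz, rolloff_octaves=0.25):
    """Weight 1 inside the band, Gaussian roll-off in log-frequency outside."""
    if freq_hz <= 0.0:
        return 0.0
    if low_hz <= freq_hz <= high_hz:
        return 1.0
    edge = low_hz if freq_hz < low_hz else high_hz
    distance = abs(math.log2(freq_hz / edge))  # distance to band edge in octaves
    return math.exp(-0.5 * (distance / rolloff_octaves) ** 2)


def weighted_band_power(spectrum, low_hz, high_hz, weight):
    """Sum the power of (frequency, power) pairs using the given weighting."""
    return sum(power * weight(freq, low_hz, high_hz) for freq, power in spectrum)


# Hypothetical spectrum with a component at 4.5 Hz, just above a 1.5 - 4 Hz band.
spectrum = [(1.0, 1.0), (3.0, 4.0), (4.5, 2.0), (40.0, 1.0)]
hard = weighted_band_power(spectrum, 1.5, 4.0, brickwall_weight)
soft = weighted_band_power(spectrum, 1.5, 4.0, soft_weight)
```

The brick-wall weighting ignores the 4.5 Hz component entirely, while the soft weighting captures most of it. The difference between `soft` and `hard` is the contribution from outside the nominal band, which is the leakage to be wary of.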

4.4 Sleep

SD and FSL were scored with equal amounts of REM sleep, in contrast to previously reported results which say that FSL should have increased REM sleep [22].

It is possible to score sleep fairly well using only the EEG, without studying EMG data. It would, however, probably be more accurate to distinguish Awake from REM by using EMG information as well.

While automated sleep scoring is not new, it usually takes both EEG and EMG into account to clearly separate awake from REM sleep, whereas the method used here relies on EEG data only. The method used here does, however, require a training dataset. Future work may build automated sleep scoring with a smaller training dataset, based solely on which frequency band is the strongest in a normalized sense. As sleep scoring is a time-consuming and common task in many experiments, such a sleep scoring algorithm could shorten turnaround time.

As illustrated by the figures, almost any band can be used to distinguish between awake and sleep. From 1 Hz to 20 Hz the median power is stronger during sleep than during awake, and from 40 Hz and above the median power is weaker during sleep than during awake. It is, on the other hand, less straightforward to separate the REM and NREM sleep states.

One interpretation of the variations in sleep patterns is that the depressed model, FSL, always has a hard time getting a good night's sleep, whereas the non-depressed model, SD, usually has a good night's sleep but not always. They might for instance be woken up by other (possibly depressed) rats rambling about in the animal facility during the night. No test was performed to see if there was any correlation in events over time between the recordings.

4.5 Strain separation

No variation in alpha power was found. However, the alpha band is not a translational band, so it might not necessarily mean anything that the variation in alpha power commonly seen in humans was not found in the FSL rats.

The difference between the strains in power during awake is significant for a 6-minute-long segment. But no classifier was created that could say with certainty whether a recording segment shorter than 1 hour comes from SD or FSL.

No effect of ketamine could be seen that indicated that a parameter estimated for an FSL rat was estimated closer to the SD value after treatment. Rather the opposite was seen, as SD was scored more or less 50/50 as SD or FSL after the ketamine administration. But that can perhaps be construed as p-hacking, as the number of observations is small compared to the number of effects tested for.


4.6 Conclusion

This study shows that it is possible to separate an animal model of depression from an animal model of non-depression based on its EEG and that EEG classifiers may work as indicative classifiers for depression. Not a lot of data is needed. Further studies are needed to verify that the results are not overly sensitive to recording setup and to study to what extent the results are translational. It might be that some of the EEG parameters with significant differences described here are limited to describing the difference between the two strains FSL and SD. But the classifiers have reasonable biological explanations that make them good candidates for being translational EEG-based classifiers for depression.

4.7 Future work

More animals and more channels could provide better certainty in the results. The method of strain separation based on EEG parameters should be tested with more strains to separate properties in comorbidity. This study used inbred strains where the severity of depression was assumed to be constant. In humans the severity of depression varies continuously, and with more data it might be possible to model this through a linear regression over severity instead of using a nominal model with only two extreme cases.
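As a minimal sketch of what such a regression could look like, the code below fits a straight line that predicts a severity score from a single EEG parameter by ordinary least squares. The data are entirely synthetic and only serve to show the mechanics; the variable names, the severity scale and the choice of gamma power as the predictor are assumptions, not results.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic example: severity constructed as 20 - 2 * gamma power (in dB).
gamma_power_db = [0.0, 1.0, 2.0, 3.0, 4.0]
severity_score = [20.0, 18.0, 16.0, 14.0, 12.0]
slope, intercept = fit_line(gamma_power_db, severity_score)
predicted = slope * 2.5 + intercept  # predicted severity for a new recording
```

With real data the fit would not be exact, and the residuals would indicate how well a single EEG parameter explains the variation in severity.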

5 Acknowledgments

Thanks to everyone hanging out in the lab. Thank you Carro, Caitlin, Ipsit, Jared, Josje, Niten, Paul and Salvatore for explaining biology to an engineer, being my sounding board and jumping in as juggling partners during breaks. Thanks to Peter Stienen for giving me insight into various ways to approach the EEG recordings. Thanks to Pawel Herman for looking at my intermediate results and enlightening me with new analysis methods. None of this would have been possible without the support from my mentor Maria Lindskog, thank you!

6 References

[1] World Health Organization: WHO. Depression. Web Page. 2012. url: http://www.who.int/mediacentre/factsheets/fs369/en/ (cit. on p. 6).

[2] mayoclinic.org. Depression (major depressive disorder) - Tests and diagnosis. Web Page. url: http://www.mayoclinic.org/diseases-conditions/depression/basics/tests-diagnosis/con-20032977 (cit. on p. 6).

[3] webmd. Depression Diagnosis. Web Page. url: http://www.webmd.com/depression/guide/depression-diagnosis (cit. on p. 6).

[4] M Lindskog. Lindskog Laboratory. Web Page. 2015. url: http://ki.se/en/neuro/lindskog-laboratory (cit. on p. 6).

[5] T F Collura. “History and evolution of electroencephalographic instruments and techniques”. In: J Clin Neurophysiol 10.4 (1993), pp. 476–504. url: http://www.ncbi.nlm.nih.gov/pubmed/8308144 (cit. on p. 7).

[6] Fredrick Lemere. The significance of individual differences in the Berger rhythm. Vol. 59. 1936, pp. 366–375. url: http://brain.oxfordjournals.org/brain/59/3/366.full.pdf (cit. on p. 7).

[7] G. d’Elia, B. Laurell, and C. Perris. “EEG Photically elicited alpha blocking responses in depressive patients before and after convulsive therapy”. In: Acta Psychiatr Scand Suppl 255 (1974), pp. 159–72 (cit. on p. 7).

[8] brainclinics.com. History of neurophysiological findings in depression. Web Page. url: http://www.brainclinics.com/depresssion-history-of-eeg-research (cit. on p. 7).


[9] G. Buzsaki and B. O. Watson. “Brain rhythms and neural syntax: implications for efficient coding of cognitive content and neuropsychiatric disease”. In: Dialogues Clin Neurosci 14.4 (2012), pp. 345–67. url: http://www.ncbi.nlm.nih.gov/pubmed/23393413, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3553572/pdf/DialoguesClinNeurosci-14-345.pdf (cit. on p. 7).

[10] György Buzsáki, Costas A Anastassiou, and Christof Koch. “The origin of extracellular fields and currents–EEG, ECoG, LFP and spikes”. In: Nat Rev Neurosci 13.6 (2012), pp. 407–20. url: http://dx.doi.org/10.1038/nrn3241, http://www.nature.com/nrn/journal/v13/n6/pdf/nrn3241.pdf (cit. on pp. 7, 10, 27).

[11] Martijn Arns and Sebastian Olbrich. “Two EEG channels do not make a ’quantitative EEG (QEEG)’: a response to Widge, Avery and Zarkowski (2013)”. In: Brain Stimul 7.1 (2014), pp. 146–8. url: http://dx.doi.org/10.1016/j.brs.2013.09.009, http://ac.els-cdn.com/S1935861X1300291X/1-s2.0-S1935861X1300291X-main.pdf?_tid=f9d10cd0-59fe-11e4-be8a-00000aab0f6c&acdnat=1413991417_f69d83c3e691e76299ac534ca25c4a9a (cit. on p. 7).

[12] G Adler, A Bramesfeld, and A Jajcevic. “Mild cognitive impairment in old-age depression is associated with increased EEG slow-wave power”. In: Neuropsychobiology 40.4 (1999), pp. 218–22. url: http://dx.doi.org/26623 (cit. on p. 7).

[13] L S Prichep and E R John. “QEEG profiles of psychiatric disorders”. In: Brain Topogr 4.4 (1992), pp. 249–57. url: http://www.ncbi.nlm.nih.gov/pubmed/1510868 (cit. on p. 7).

[14] I A Cook and A F Leuchter. “Prefrontal changes and treatment response prediction in depression”. In: Semin Clin Neuropsychiatry 6.2 (2001), pp. 113–20. url: http://www.ncbi.nlm.nih.gov/pubmed/11296311 (cit. on p. 7).

[15] Martin Bares, Martin Brunovsky, Miloslav Kopecek, Pavla Stopkova, Tomas Novak, Jiri Kozeny, and Cyril Höschl. “Changes in QEEG prefrontal cordance as a predictor of response to antidepressants in patients with treatment resistant depressive disorder: a pilot study”. In: J Psychiatr Res 41.3-4 (2007), pp. 319–25. url: http://dx.doi.org/10.1016/j.jpsychires.2006.06.005 (cit. on p. 7).

[16] Miloslav Kopecek, Peter Sos, Martin Brunovsky, Martin Bares, Pavla Stopkova, and Vladimir Krajca. “Can prefrontal theta cordance differentiate between depression recovery and dissimulation?” In: Neuro Endocrinol Lett 28.4 (2007), pp. 524–6. url: http://www.ncbi.nlm.nih.gov/pubmed/17693989 (cit. on p. 7).

[17] M Gómez-Galán, D De Bundel, A Van Eeckhaut, I Smolders, and M Lindskog. “Dysfunctional astrocytic regulation of glutamate transmission in a rat model of depression”. In: Mol Psychiatry 18.5 (2013), pp. 582–94. url: http://dx.doi.org/10.1038/mp.2012.10, http://www.nature.com/mp/journal/v18/n5/pdf/mp201210a.pdf (cit. on p. 8).

[18] M. Lindskog. “A forward translational project to segment depression and suggest treatment based on EEG”. 2014 (cit. on p. 8).

[19] Mentis Cura. Mentis Cura. Web Page. url: http://www.mentiscura.com/ (cit. on p. 8).

[20] Jon Snaedal, Gisli Holmar Johannesson, Thorkell Eli Gudmundsson, Nicolas Petur Blin, Asdis Lilja Emilsdottir, Bjorn Einarsson, and Kristinn Johnsen. “Diagnostic accuracy of statistical pattern recognition of electroencephalogram registration in evaluation of cognitive impairment and dementia”. In: Dement Geriatr Cogn Disord 34.1 (2012), pp. 51–60. url: http://dx.doi.org/10.1159/000339996, http://www.karger.com/Article/Pdf/339996 (cit. on p. 8).

[21] David H Overstreet and Gregers Wegener. “The flinders sensitive line rat model of depression–25 years and still producing”. In: Pharmacol Rev 65.1 (2013), pp. 143–55. url: http://dx.doi.org/10.1124/pr.111.005397, http://pharmrev.aspetjournals.org/content/65/1/143.full.pdf (cit. on p. 8).

[22] D H Overstreet. “The Flinders sensitive line rats: a genetic animal model of depression”. In: Neurosci Biobehav Rev 17.1 (1993), pp. 51–68. url: http://www.ncbi.nlm.nih.gov/pubmed/8455816, http://www.sciencedirect.com/science/article/pii/S0149763405802301 (cit. on pp. 8, 34).

[23] DSI. DSI F40-EET transmitter specifications. Web Page. 2014. url: http://www.datasci.com/products/implantable-telemetry/specification-overview (cit. on p. 10).


[24] [email protected]. F40 transmitter. Personal Communication. Aug. 2014 (cit. on p. 10).

[25] J Dauwels, F Vialatte, and A Cichocki. “Diagnosis of Alzheimer’s disease from EEG signals: where are we standing?” In: Curr Alzheimer Res 7.6 (2010), pp. 487–505. url: http://www.ncbi.nlm.nih.gov/pubmed/20455865 (cit. on pp. 14, 18, 27).

[26] Christophe Leys, Christophe Ley, Olivier Klein, Philippe Bernard, and Laurent Licata. “Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median”. In: Journal of Experimental Social Psychology 49.4 (2013), pp. 764–766. url: http://www.sciencedirect.com/science/article/pii/S0022103113000668, http://ac.els-cdn.com/S0022103113000668/1-s2.0-S0022103113000668-main.pdf?_tid=5c2182d2-5ecf-11e4-9b52-00000aab0f6c&acdnat=1414520722_a1066cce375edc71f79f1c2ae6671b9e (cit. on p. 17).

[27] R. Mitchell Parry and Irfan Essa. Incorporating phase information for source separation via spectrogram factorization. Report. Georgia Institute of Technology (cit. on p. 17).

[28] Y. Li and A. Ngom. “The non-negative matrix factorization toolbox for biological data mining”. In: Source Code Biol Med 8.1 (2013), p. 10. url: http://www.ncbi.nlm.nih.gov/pubmed/23591137, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3736608/pdf/1751-0473-8-10.pdf (cit. on p. 18).

[29] C. Vakalopoulos. “The EEG as an index of neuromodulator balance in memory and mental illness”. In: Front Neurosci 8 (2014), p. 63. url: http://www.ncbi.nlm.nih.gov/pubmed/24782698, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3986529/pdf/fnins-08-00063.pdf (cit. on p. 27).


Appendix

A Time-frequency plots of recorded data

Appendix, with a listing of the studied data as time-frequency plots through continuous wavelet transforms (CWT).

Time-frequency plots of single-channel EEG recordings on the skull above the prefrontal cortex from 16 rats. There are individual variations, but the recording location is not exact, which causes additional variations and is likely the explanation for strong unique components or the absence of otherwise common components. Images show a continuous wavelet transform with 40 scales per octave produced by Freq. The x-axis spans 6 full days with 500 samples per second. The y-axis spans 1 to 100 Hz logarithmically. Darker shades represent more power in a given time-frequency location. The full-resolution continuous wavelet transform is downsampled to visible pixels by using a local max of the area each pixel represents.
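The numbers above imply a large transform: 6 days at 500 samples per second is 259,200,000 samples per channel, and 40 scales per octave between 1 and 100 Hz gives a few hundred scales. The sketch below computes the scale frequencies and shows a local-max reduction of one row of the transform to a given number of pixels. It illustrates the general idea only and is not taken from Freq; the function names are hypothetical.

```python
import math


def scale_frequencies(f_min_hz, f_max_hz, scales_per_octave):
    """Logarithmically spaced frequencies from f_min_hz up to at most f_max_hz."""
    octaves = math.log2(f_max_hz / f_min_hz)
    count = int(math.floor(octaves * scales_per_octave)) + 1
    return [f_min_hz * 2.0 ** (k / scales_per_octave) for k in range(count)]


def downsample_local_max(values, pixels):
    """Reduce a row of values to `pixels` values, each the max of its segment."""
    n = len(values)
    out = []
    for p in range(pixels):
        start = p * n // pixels
        stop = max((p + 1) * n // pixels, start + 1)  # at least one sample per pixel
        out.append(max(values[start:stop]))
    return out


frequencies = scale_frequencies(1.0, 100.0, 40)   # 266 scales
row = downsample_local_max([1, 5, 2, 0, 3, 4], 3)  # [5, 2, 4]
```

Using the maximum rather than the mean keeps short, strong events visible even when one pixel covers several minutes of data.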

A.1 Sprague Dawley

The animal IDs used in this study are in order: R2, R3, R4, R5, R6, R8, R9, R15.

[Figures: CWT of EEG for R2, R3, R4, R5, R6, R8, R9 and R15, one panel per animal. x-axis: Days (1 to 6); y-axis: Hz (logarithmic, ticks at 2, 10 and 100).]

A.2 Flinders sensitive line

The animal IDs used in this study are in order: R10, R11, R12, R13, R14, R16, R17, R18

[Figures: CWT of EEG for R10, R11, R12, R13, R14, R16, R17 and R18, one panel per animal. x-axis: Days (1 to 6); y-axis: Hz (logarithmic, ticks at 2, 10 and 100).]
