Brecht Verhoeve

Modifiable drone thermal imaging analysis framework for mob detection during open-air events

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Computer Science Engineering

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck
Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons

Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Academic year 2017-2018
Permission for usage
"The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master dissertation."
Brecht Verhoeve
Ghent, June 2018
Preface
This master dissertation is submitted in completion of the academic degree of Master of Science in Computer Science Engineering at Ghent University. The dissertation investigates the upcoming combination of drones and thermal cameras, their use cases and supporting technologies. The dissertation led me through various fields such as software architecture, microservices, software containerization, GPUs and neural networks. I wrote the dissertation focusing on the business and technological aspects that could lead to increasing industry adoption of these technologies.
I would like to thank my supervisors and counsellors for their continuous support this year. You were always there for a quick meeting, during which the atmosphere was always positive and jokes were always around the corner, but with a focus on results. Prof. Volckaert for the quickest replies to emails I have yet witnessed to this day and for guiding me through the complex journey of this dissertation. Jerico Moeyersons for the office hop-ins and the help during that annoying CUDA installation. Pieter-Jan Maenhaut for his questions and reviews during meetings, which provided me with new insights and things to write about. Nils Tijtgat for the support in the early days of the thesis; I've read your tutorial on YOLO more times than I would like to admit. And finally Prof. De Turck for the opportunity to work on this topic.
I am grateful for the company I had this year when working on the dissertation. Ozan Catal, Joran Claeys, Stefan Wauters, Dries
Bosman, Pieter De Cleer, Igor Lima de Paula, Laura Van Messem, Lars De Brabandere, Stijn Cuyvers, Stijn Poelman, thank you for
the fun times, spontaneous beers and support this year!
Special thanks go out to the people of the VTK and FK student associations. You provided me with unforgettable experiences, friendships, teachings and memories. A special mention to Stéphanie, Anna and Nick from Career & Development, everyone from Delta, and finally Stijn Adams and Sander De Wilde for their continuous support throughout the years.
Finally, I want to thank my parents and Marjolein Hondekyn for their advice and massive support. Without you, I wouldn't have been able to wrestle myself through the tough periods and finish the dissertation!
Brecht Verhoeve
Ghent, June 2018
Modifiable drone thermal imaging analysis framework for mob detection during
open-air events
Brecht Verhoeve
Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck
Counsellors: Pieter-Jan Maenhaut, Jerico Moeyersons
Master’s dissertation submitted in order to obtain the academic degree of
Master of Science in Computer Science Engineering
Department of Information Technology
Chair: Prof. dr. ir. Bart Dhoedt
Faculty of Engineering and Architecture
Ghent University
Academic year 2017-2018
Abstract
Drones and thermal cameras are used in combination for many applications, such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, such applications are hard to develop and maintain. Therefore, a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users build image processing applications with the framework by composing media streams from plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof-of-concept implementation, which is tested against the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied to detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs in new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.
Keywords: Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin
Modifiable Drone Thermal Imaging Analysis Framework for Mob Detection during Open-Air Events

Brecht Verhoeve

Supervisors: prof. dr. Bruno Volckaert, prof. dr. ir. Filip De Turck, Pieter-Jan Maenhaut, Jerico Moeyersons
Abstract— Drones and thermal cameras are used in combination for many applications such as search and rescue, fire fighting, etc. Due to vendor-specific hardware and software, applications are hard to develop and maintain. Therefore a modifiable drone thermal imaging analysis framework is proposed that enables users to more easily develop such image processing applications. It implements a microservice plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation which is tested on the initial requirements. It achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU.
Keywords— Drone thermal imaging, Video streaming, Framework, Microservices, Object detection, Plugin
I. INTRODUCTION
THROUGHOUT history, having an overview of the environment from high viewpoints held many benefits. The advent of drones and advanced cameras provides low-cost aerial imaging that creates numerous opportunities for new applications. Traditional visual cameras for the visible light spectrum offer high quality images, but are limited to daytime or artificially lighted scenes. Thermal cameras measure the thermal radiation of objects in a scene and thus can operate in utter darkness, revealing information not visible to the normal eye [1]. The combination of drones and thermal cameras is used in many different applications such as geography [2, 3], agriculture [4], search and rescue [5], wildlife monitoring [6], disaster response [7], maintenance [8], etc.
Several vendors offer thermal camera products, some specifically designed for drone platforms. These cameras often use different image formats, color schemes and interfaces [1, 9–11]. This leads to issues if applications want to change the camera that is used, or when the camera is no longer supported by the vendor, because different software needs to be built to interact with the new camera, which often is a non-negligible cost. This leads to a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch product without incurring substantial costs, a problem already very tangible for cloud-based applications [12]. Applications across various fields often have slightly different functional and non-functional requirements. For this dissertation several Belgian fire fighting departments were asked for requirements for a thermal drone platform application. It quickly became clear that they had various problems that needed to be solved, such as finding hot explosives, measuring temperatures in containers, identifying hot entrances, detecting invisible methane fires, finding missing persons, etc. Some use cases need to be evaluated in real-time (during fires), others need to be extremely accurate. A hypothetical application should be able to quickly implement new detection and analysis features to meet all these requirements. Because current solutions are not modifiable enough, applications built with aerial thermal imaging technology remain in the niche use case for which they were initially developed [13]. Applications could benefit from a backbone framework that addresses this modifiability/interoperability issue, aiding in the development of end-to-end solutions connecting thermal cameras to various analysis/detection modules.
This dissertation explores the requirements for such a framework and its potential software architecture. To test the viability of the architecture, a proof of concept prototype is implemented and evaluated against the initial requirements. To verify whether it aids in developing detection applications, the specific use case of detecting large crowds of people, so-called mobs, during open-air events is investigated. Monitoring crowds during open-air events is important, as mobs can create potentially dangerous situations through bottlenecks, blocking escape routes, etc. Through monitoring and detecting these mobs, such situations can be avoided before they become problematic [14, 15].
The remainder of this paper is organized as follows. Section II presents similar projects on the topic of modifiable imaging analysis frameworks and thermal object detection. Section III presents the requirements of the framework and the software architecture designed from these requirements. Section IV presents the implementation of the framework prototype. The mob detection experiment is described in Section V. The tests and results to evaluate the framework and the mob detection experiment are presented in Section VI. Finally, Section VII draws conclusions from this research and indicates where future efforts in this field should go.
II. RELATED WORK
The Irish start-up DroneSAR [16] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual and thermal, but focuses on drones from vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [17]. The VIPER project by EAVISE, KU Leuven researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [18]. The framework presented in this work combines elements from all three of these examples.
III. REQUIREMENTS AND SOFTWARE ARCHITECTURE
A. Functional requirements
Three general actors are identified for the framework: end-users who want to build an image processing application for a specific use case, camera developers integrating new cameras into the framework, and analysis/detection module developers integrating new analysis/detection algorithms into the framework so that end-users can use them to build their applications. An end-user should be able to construct an image processing application using the integrated cameras and analysis modules, and should be able to adapt this application with the framework for new use cases. Camera and analysis module developers should be able to add and distribute new integrations for new cameras and analysis modules to the framework. This allows end-users to focus on the use case rather than the technical details of the hardware platforms or algorithms, and gives them a wider selection of hardware and algorithms.
B. Non-functional requirements
Interoperability, modifiability and performance are identified as the architecturally significant requirements. Interoperability specifies that the framework should be able to interact with various cameras and analysis software modules via interfaces. The number of systems the framework can successfully interact with adds to the business value of the framework, as end-users can use more devices via the framework to build applications. The framework needs to be extendable with new thermal cameras and analysis modules, and applications built with the framework should be modifiable to integrate new hardware and software. The available hardware on-site for use cases such as forest fire monitoring is not always powerful enough to support heavy image analysis software. The framework should therefore be able to deploy in a distributed fashion, allowing more computationally expensive operations to be executed on more powerful remote devices. Some use cases require real-time streaming of video and manipulation of these video streams, which must be supported for the framework to be relevant.
C. Software architecture
An architectural pattern analysis based on the requirements presented in Section III-B was conducted, from which a combination of the microservices and microkernel patterns was selected as the base pattern for the software architecture. The microkernel pattern enables the framework to be extended via a plugin system. The microservices pattern enables interoperability through the use of standard interfaces and a communication scheme. It also allows the framework to be deployed in a distributed fashion [19–21]. The software architecture is documented in static views, sequence diagrams and deployment views. Figure 1 presents an overview of the architecture.
Fig. 1: Component-connector overview of the framework. The clear components are the core components of the framework that each user needs to install to use the framework. The colored components are used for the distribution of plugins.
End-users interact with the framework via the Client Interface, a graphical or textual interface. Commands are forwarded to the Stream module that manages the layout and state of the components in the media pipeline (stream) that the end-user builds for his use case. To activate and place the components in a certain layout, the Stream module connects with the Producer and Consumer components that control the plugins that process the media. Producer Plugins are devices that produce media, such as thermal cameras. Consumer Plugins process and consume media, such as analysis software and displays. Once a stream is established, the plugins forward media to each other in the layout specified by the Stream module. New support for cameras and analysis modules can be added as plugins to the Producer/Consumer Distribution components, which distribute this software so that end-users can download and install the plugins. Each module in the architecture is a microservice, allowing for a distributed deployment and enabling the modifiability requirements through interchangeable modules. Cameras and analysis modules are realized as plugins for the Producer/Consumer modules, which are implemented as a microkernel. This allows the framework to easily build new streams for different use cases and interchange components when needed.
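As an illustration, the wiring the Stream module performs can be reduced to a set of REST calls that set each plugin's listeners. The helper below is a hypothetical sketch, not code from the prototype: the layout format, URL scheme and use of PUT are assumptions.

```python
# Hypothetical sketch: derive the REST calls the Stream module would issue
# to wire a media pipeline. Endpoint names follow the /listeners resource
# of the plugin API described in this paper; everything else is assumed.

def wiring_calls(layout, base="http://{plugin}/api"):
    """layout maps a plugin name to the plugins listening to its output."""
    calls = []
    for plugin, listeners in layout.items():
        url = base.format(plugin=plugin) + "/listeners"
        calls.append(("PUT", url, {"listeners": sorted(listeners)}))
    return calls

# A stream with one camera plugin feeding an analysis module and a display:
layout = {
    "filecam": ["analysis", "display"],
    "analysis": ["display"],
    "display": [],  # consumers at the end of the stream have no listeners
}
for method, url, body in wiring_calls(layout):
    print(method, url, body)
```

In a running deployment these tuples would be sent with an HTTP client; here they are kept as plain data so the layout logic can be inspected on its own.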
C.1 Plugin model
Figure 2 depicts the model of a general framework plugin. The plugin defines three interfaces: a source media endpoint to receive media from different sources, a listener endpoint to forward the processed media to other listening plugins, and an API for framework control. The framework uses the API to change which sources and listeners a plugin has, as well as its state. By linking plugins together through the sources and listeners resources, the framework can build a media processing stream. Producer Plugins have no sources, since they produce media. The states are used to stop and start the media processing of the plugins in the stream.

Fig. 2: Schematic overview of a plugin.

The REST paradigm is selected to build this API, with /state, /sources and /listeners resources that need to be minimally implemented. A plugin has the following states: INACTIVE, PLAY, STOP and PAUSE. Figure 3 depicts the state transition diagram. A plugin implements the visible states STOP, PAUSE and PLAY, describing whether the media process of the plugin is stopped, paused or processing respectively. The INACTIVE state is only visible to the framework, as it indicates that there is no active process running the plugin. This is the initial state of a plugin in the framework. When a plugin is added to a stream, the plugin microservice is started, transitions to the STOP state and waits for commands.
Fig. 3: State transition diagram of a plugin.
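The lifecycle above can be sketched as a small state machine. The exact transition set of Figure 3 is not reproduced in this text, so the ALLOWED table below is an assumption consistent with the described behavior.

```python
# Minimal sketch of the plugin state machine of Figure 3. The allowed
# transitions are an assumption based on the text: a plugin starts
# INACTIVE, moves to STOP when its container is started, cycles through
# PLAY/PAUSE, and returns to INACTIVE when deactivated.

ALLOWED = {
    "INACTIVE": {"STOP"},          # container started, plugin awaits commands
    "STOP": {"PLAY", "INACTIVE"},  # start processing, or remove the container
    "PLAY": {"PAUSE", "STOP"},
    "PAUSE": {"PLAY", "STOP"},
}

class Plugin:
    def __init__(self):
        self.state = "INACTIVE"  # initial state in the framework

    def transition(self, target):
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target

p = Plugin()
p.transition("STOP")  # framework adds the plugin to a stream
p.transition("PLAY")  # media processing starts
```

A real plugin would expose these transitions through its /state REST resource; the class keeps only the validation logic.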
C.2 Network topology and communication protocol
The microservices of the framework and the plugins need a communication protocol to exchange commands and video. For sending the commands the HTTP/TCP protocol is used, a synchronous protocol that blocks on the response of a request. This ensures that commands are always received and acknowledged, increasing the reliability of the communication at the cost of performance [22]. The asynchronous RTP/UDP protocol is selected to transfer media between the plugins, ensuring low-latency video transfer that enables real-time video streams. The recommended codec for transmitting video media is MJPEG, which transmits video frames as separately encoded JPEG images [23]. Other codecs such as MPEG-4 encode only certain frames as keyframes, and the remaining frames as B-frames that encode differences from the keyframes [24]. This implies that when receiving images from a stream, a keyframe must first be received before the video can be decoded. With MJPEG, plugins receiving frames can directly perform analysis on each frame and do not have to wait for additional frames. An example of this network topology for a stream with 2 Producer plugins and 3 Consumer plugins is depicted in Figure 4.
Fig. 4: Network topology. The full lines represent HTTP/TCP communications, the dashed lines RTP/UDP communications.
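The prototype's GStreamer pipelines are not listed in this text; as a hedged illustration, MJPEG-over-RTP sender and receiver pipelines could be described as follows, using standard GStreamer elements (videotestsrc stands in for a real camera source, and the host/port values are examples).

```python
# Hypothetical gst-launch-style pipeline descriptions for MJPEG over RTP,
# matching the protocol choices in the text. The element chains are an
# assumption; the prototype's actual pipelines are not given.

def sender_pipeline(host, port):
    # Encode each frame as an independent JPEG and pay it out over RTP/UDP.
    return (
        "videotestsrc ! jpegenc ! rtpjpegpay ! "
        f"udpsink host={host} port={port}"
    )

def receiver_pipeline(port):
    # Depayload RTP, decode the JPEG frames, and hand them to a sink.
    return (
        f"udpsrc port={port} caps=application/x-rtp,encoding-name=JPEG ! "
        "rtpjpegdepay ! jpegdec ! autovideosink"
    )

print(sender_pipeline("10.0.0.2", 5000))
print(receiver_pipeline(5000))
```

Because every MJPEG frame is self-contained, the receiver can start decoding from any packet boundary, which is exactly the property the text relies on for real-time analysis plugins.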
IV. PROTOTYPE IMPLEMENTATION
The goal of the prototype implementation is to provide a proof of the concept architecture and meet the requirements presented in Section III. The core framework components are implemented; the distribution components are left out of scope, as they focus primarily on the distribution of supported plugins. The core microservices as well as the plugins are implemented using the Docker software containerization framework [25]. Containers virtualize at the operating system level and allow for portable, lightweight software environments for processes with a minor performance overhead. Using this technology the core modules and plugins can be deployed in a local and distributed fashion and be set up and shut down easily [26]. The microservice containers communicate via the protocols presented in Section III-C.2. The REST APIs are built with the Flask framework [27], a lightweight Python web development framework ideal for prototyping. The Producer/Consumer components need access to the Docker daemon running on the Docker host in order to spin up and shut down Docker containers containing the plugin processes. This is achieved by mounting the Docker client socket in the Producer/Consumer containers. This gives the container root access to the host, a significant security threat [28, 29]. Two sample plugins were implemented: Filecam, a plugin that produces video read in from a file, and Display, a plugin that forwards media to the display of the local device. The plugins transmit media using the video streaming framework GStreamer [30].
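To make the security trade-off concrete, the sketch below builds the kind of `docker run` invocation a Producer/Consumer service might issue for a plugin container. The image and container names are hypothetical; the bind-mount of the Docker socket is the root-access risk mentioned above.

```python
# Sketch of the `docker run` argument list a Producer/Consumer service
# might build to start a plugin container. The image name is hypothetical;
# mounting /var/run/docker.sock is the host-root-access risk noted in the
# text, so it is off by default here.

def plugin_run_args(image, name, mount_docker_socket=False):
    args = ["docker", "run", "-d", "--rm", "--name", name]
    if mount_docker_socket:
        # Grants the container control over the host Docker daemon,
        # effectively root on the host. Only the Producer/Consumer
        # services themselves need this, never ordinary plugins.
        args += ["-v", "/var/run/docker.sock:/var/run/docker.sock"]
    args.append(image)
    return args

print(" ".join(plugin_run_args("framework/filecam:latest", "filecam-1")))
```

The argument list could be handed to `subprocess.run` on a host with Docker installed; keeping it as data makes the privileged mount easy to audit.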
V. MOB DETECTION
A. Dataset
Several publicly available datasets for thermal images exist [31–34]. None of these include large crowds of people, so a new dataset called the Last Post dataset was created. It consists of thermal video captured at the Last Post ceremony in Ypres, Belgium [35]. The videos were captured using the FLIR One Pro thermal camera for Android [36] using the Iron color scheme. Two main scenes are present in the dataset, depicted in Figure 5. Mobs are present in the thermal images but not in the visual images, as the images were made on separate days. The images used for the experiment were manually annotated, outliers were removed, and the dataset was randomly split into a training and a validation set.

Fig. 5: Last Post dataset main scenes. (a) Thermal view of the square. (b) Visual view of the square. (c) Thermal view of the bridge. (d) Visual view of the bridge.
B. Model
Detecting and classifying objects of interest in images is known as the object detection problem in machine learning [37]. Several object detection algorithms and frameworks have been implemented in the past years. A distinction is made between traditional models [31, 38–40], deep learning two-stage networks [41–46] and deep learning dense networks [47–49]. The traditional and two-stage methods make predictions relatively slowly (order of seconds on a GPU) compared to the dense networks (order of milliseconds on a GPU) [47]. Since the goal is to use the framework in real-time use cases, the latter is preferred. The YOLOv3 model is selected, as it achieves state-of-the-art prediction performance, can make real-time predictions and is available via the open source neural network framework darknet [50, 51]. The model is pre-trained on the ImageNet dataset [52]. The model is trained on an NVIDIA GeForce GTX 980 GPU and optimizes the SSE loss using batch gradient descent [50]. To select the best weights, the average Intersection over Union (IoU) and mean Average Precision (mAP) [53] are calculated on predictions on the validation set. The weights that achieve the highest mAP are selected as the final weights.
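For reference, the IoU metric used for weight selection can be computed for two axis-aligned boxes as follows; the (x1, y1, x2, y2) corner format is an assumption, as the annotation format is not specified here.

```python
# Intersection over Union for two axis-aligned boxes given as
# (x1, y1, x2, y2). This is the standard definition used when scoring
# detections against ground truth; the box format is an assumption.

def iou(a, b):
    # Corners of the intersection rectangle (empty if boxes are disjoint).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two unit squares overlapping in a 0.5-wide strip: 0.5 / 1.5 = 1/3.
print(iou((0, 0, 1, 1), (0.5, 0, 1.5, 1)))
```

Averaging this value over all matched predictions on the validation set gives the average IoU used alongside mAP for selecting the final weights.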
VI. RESULTS
A. Framework
To evaluate the framework, acceptance tests for the requirements from Section III were conducted. Common framework operations such as manipulating and building a stream have an average execution time of 0.84 seconds with a standard deviation of 0.37 seconds. Less common operations such as deactivating a plugin, starting up the framework and shutting down the framework have average execution times of 3.58, 8.40 and 24.02 seconds respectively, with standard deviations of 4.67, 0.50 and 0.48 seconds respectively. Deactivating plugins (STOP to INACTIVE transitions) takes a long time, as the container running the process needs to be removed. Real-time streaming could not be tested directly, as the GStreamer framework has no readily available testing endpoints. However, when streaming and displaying a video with the framework, human users could not differentiate between the streamed video and the same video played with a native media player, making it plausible that the framework streams in real-time. Great care must be taken when building plugins, as their processing speed has a direct impact on the real-time streaming performance. Interoperability is achieved with the REST APIs and the plugin model presented in Section III-C. The interoperability was tested by having the framework exchange information with a mock plugin implementing the specified interface and counting the number of correct exchanges. The average successful exchange ratio is 99.998%. The framework can install and detect new plugins at runtime, achieving runtime modifiability at the plugin level. Different deployment schemes were not tested for the prototype.
B. Mob detection
The weights generated at the 15700th training epoch achieved the highest mAP value, 90.52% on the validation set. For comparison, the performance of other models on benchmark datasets achieves an average mAP of 74.8% [54]. The reason the model achieves such high values is that the validation set is temporally correlated with the training set, as both sets are extracted from videos in which frames have a temporal correlation. Performance when predicting on new datasets will be worse. Figure 6 depicts some predictions of the model. When predicting on a video, the model generated predictions at an average frame rate of 55 frames per second on a GPU.
Fig. 6: Model predictions on validation set.
VII. CONCLUSION AND FUTURE WORK
In this dissertation a modifiable drone thermal imaging analysis framework is proposed to allow end-users to build flexible video processing pipelines using different thermal cameras and analysis modules. The framework implements a microservice container plugin architecture. Users can build image processing applications with the framework by building media streams using plugins that are either thermal cameras or image analysis software modules. The framework is evaluated by building a proof of concept implementation which is tested on the initial requirements. The proposed framework achieves the modifiability and interoperability requirements at the cost of performance and security. The framework is applied for detecting large crowds of people (mobs) during open-air events. A new dataset containing thermal images of such mobs is presented, on which a YOLOv3 neural network is trained. The trained model is able to detect mobs on new thermal images in real-time, achieving frame rates of 55 frames per second when deployed on a modern GPU. Some extensions to this research are: deploying a detection model using the framework, testing the other deployment configurations, testing the framework with end-users in practice, and building new object detection models specifically for thermal images.
REFERENCES
[1] R. Gade and T. B. Moeslund, "Thermal cameras and applications: a survey," Machine Vision and Applications, vol. 25, pp. 245–262, 2014.
[2] M. C. Harvey, J. V. Rowland, and K. M. Luketina, "Drone with thermal infrared camera provides high resolution georeferenced imagery of the Waikite geothermal area, New Zealand," 2016.
[3] S. Amici, M. Turci, S. Giammanco, L. Spampinato, and F. Giulietti, "UAV Thermal Infrared Remote Sensing of an Italian Mud Volcano," vol. 2, pp. 358–364, 2013.
[4] J. Bendig, A. Bolten, and G. Bareth, "Introducing a low-cost mini-UAV for thermal- and multispectral-imaging," 2012.
[5] A. J. Rivera, A. D. Villalobos, J. C. Monje, J. A. Mariñas, and C. M. Oppus, "Post-disaster rescue facility: Human detection and geolocation using aerial drones," IEEE Region 10 Annual International Conference, Proceedings/TENCON, pp. 384–386, 2017.
[6] P. Christiansen, K. A. Steen, R. N. Jørgensen, and H. Karstoft, "Automated detection and recognition of wildlife using thermal cameras," Sensors (Basel, Switzerland), vol. 14, pp. 13778–13793, Jul. 2014.
[7] S. Chowdhury, A. Emelogu, M. Marufuzzaman, S. G. Nurre, and L. Bian, "Drones for disaster response and relief operations: A continuous approximation model," 2017.
[8] Workswell, "Pipeline inspection with thermal diagnostics," 2016.
[9] DJI, "Zenmuse H3 - 2D."
[10] Workswell, "Applications of WIRIS - Thermal vision system for drones."
[11] Therm-App, "Therm-App - Android Apps on Google Play," 2018.
[12] B. Satzger, W. Hummer, C. Inzinger, P. Leitner, and S. Dustdar, "Winds of change: From vendor lock-in to the meta cloud," IEEE Internet Computing, vol. 17, no. 1, pp. 69–73, 2013.
[13] J. Divya, "Drone Technology and Usage: Current Uses and Future Drone Technology," 2017.
[14] B. Steffen and A. Seyfried, "Methods for measuring pedestrian density, flow, speed and direction with minimal scatter," Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 1902–1910, May 2010.
[15] M. Wirz, T. Franke, D. Roggen, E. Mitleton-Kelly, P. Lukowicz, and G. Tröster, "Inferring crowd conditions from pedestrians' location traces for real-time crowd monitoring during city-scale mass gatherings," Proceedings of the Workshop on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE, pp. 367–372, 2012.
[16] L.-L. Slattery, "DroneSAR wants to turn drones into search-and-rescue heroes," 2017.
[17] Amazon Web Services Inc., "What Is Amazon Kinesis Video Streams?," 2018.
[18] T. Goedeme, "Projectresultaten VLAIO TETRA-project," tech. rep., KU Leuven, Louvain, 2017.
[19] L. Bass, P. Clements, and R. Kazman, Software Architecture in Practice. Addison-Wesley Professional, 3rd ed., 2012.
[20] M. Richards, Software Architecture Patterns. O'Reilly Media, 1st ed., 2015.
[21] C. Richardson, "Microservice Architecture pattern," 2017.
[22] C. De La Torre, C. Maddock, J. Hampton, P. Kulikov, and M. Jones, "Communication in a microservice architecture," 2017.
[23] On-Net Surveillance Systems Inc., "MJPEG vs MPEG4: Understanding the differences, advantages and disadvantages of each compression technique," 2006.
[24] D. Bull, Communicating Pictures: A Course in Image and Video Coding. Elsevier Science, 2014.
[25] Docker Inc., "Docker - Build, Ship, and Run Any App, Anywhere," 2018.
[26] D. Merkel, "Docker: Lightweight Linux Containers for Consistent Development and Deployment," 2014.
[27] A. Ronacher, "Welcome to Flask - Flask Documentation (0.12)," 2017.
[28] Lvh, "Don't expose the Docker socket (not even to a container)," 2015.
[29] R. Yasrab, "Mitigating Docker Security Issues," tech. rep., University of Science and Technology of China, Hefei.
[30] GStreamer, "GStreamer: open source multimedia framework," 2018.
[31] J. W. Davis and M. A. Keck, "A Two-Stage Template Approach to Person Detection in Thermal Imagery," Proc. Workshop on Applications of Computer Vision, 2005.
[32] S. Hwang, J. Park, N. Kim, Y. Choi, and I. S. Kweon, "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline," CVPR, 2015.
[33] Z. Wu, N. Fuller, D. Theriault, and M. Betke, "A Thermal Infrared Video Benchmark for Visual Analysis," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
[34] S. Z. Li, R. Chu, S. Liao, and L. Zhang, "Illumination Invariant Face Recognition Using Near-Infrared Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 627–639, 2007.
[35] Last Post Association, "Mission," 2018.
[36] FLIR, "FLIR One Pro."
[37] E. Alpaydin, Introduction to Machine Learning. MIT Press, 3rd ed., 2014.
[38] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 1, pp. 63–71, 2005.
[39] H. Nanda and L. Davis, "Probabilistic template based pedestrian detection in infrared videos," IEEE Intelligent Vehicles Symposium, Proceedings, vol. 1, pp. 15–20, 2003.
[40] R. Appel, S. Belongie, P. Perona, and P. Dollár, "Fast Feature Pyramids for Object Detection," PAMI, vol. 36, no. 8, pp. 1–14, 2014.
[41] J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition," tech. rep., 2012.
[42] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Region-Based Convolutional Networks for Accurate Object Detection and Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 1, pp. 142–158, 2014.
[43] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448, 2015.
[44] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016.
[45] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," arXiv, 2018.
[46] J. Dai, Y. Li, K. He, and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," tech. rep., 2016.
[47] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[48] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," arXiv, 2016.
[49] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," arXiv, 2018.
[50] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv, 2018.
[51] J. Redmon, "Darknet: Open source neural networks in C." http://pjreddie.com/darknet/, 2013–2016.
[52] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR09, 2009.
[53] M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The Pascal Visual Object Classes Challenge: A Retrospective," International Journal of Computer Vision, vol. 111, no. 1, pp. 98–136, 2014.
[54] A. Ouaknine, "Review of Deep Learning Algorithms for Object Detection," 2018.
Contents
1 Introduction
  1.1 Drones
  1.2 Concepts
    1.2.1 Thermal Cameras
    1.2.2 Aerial thermal imaging
  1.3 Problem statement
    1.3.1 Industry adoption
    1.3.2 Crowd monitoring
    1.3.3 Goal
    1.3.4 Related work
  1.4 Outline

2 System Design
  2.1 Requirements analysis
    2.1.1 Functional requirements
    2.1.2 Non-functional requirements
  2.2 Patterns and tactics
    2.2.1 Layers
    2.2.2 Event-driven architecture
    2.2.3 Microkernel
    2.2.4 Microservices
    2.2.5 Comparison of patterns
  2.3 Software architecture
    2.3.1 Static view
    2.3.2 Dynamic views
    2.3.3 Deployment views

3 State of the art and technology choice
  3.1 Thermal camera options
    3.1.1 Parameters
    3.1.2 Comparative analysis
  3.2 Microservices frameworks
    3.2.1 Flask
    3.2.2 Falcon
    3.2.3 Nameko
    3.2.4 Vert.x
    3.2.5 Spring Boot
  3.3 Deployment framework
    3.3.1 Containers
    3.3.2 LXC
    3.3.3 Docker
    3.3.4 rkt
  3.4 Object detection algorithms and frameworks
    3.4.1 Traditional approaches
    3.4.2 Deep learning
    3.4.3 Frameworks
  3.5 Technology choice
    3.5.1 Thermal camera
    3.5.2 Microservices framework
    3.5.3 Deployment framework
    3.5.4 Object detection

4 Proof of Concept implementation
  4.1 Goals and scope of prototype
  4.2 Overview of prototype
    4.2.1 General overview
    4.2.2 Client interface
    4.2.3 Stream
    4.2.4 Producer and Consumer
    4.2.5 Implemented plugins
  4.3 Limitations and issues
    4.3.1 Single client
    4.3.2 Timeouts
    4.3.3 Exception handling and testing
    4.3.4 Docker security issues
    4.3.5 Docker bridge network
    4.3.6 Single stream
    4.3.7 Number of containers per plugin

5 Mob detection experiment
  5.1 Last Post thermal dataset
    5.1.1 Last Post ceremony
    5.1.2 Dataset description
  5.2 Object detection experiment
    5.2.1 Preprocessing
    5.2.2 Training

6 Results and evaluation
  6.1 Framework results
    6.1.1 Performance evaluation
    6.1.2 Interoperability evaluation
    6.1.3 Modifiability evaluation
  6.2 Mob detection experiment results
    6.2.1 Training results
    6.2.2 Metrics
    6.2.3 Validation results

7 Conclusion and future work
  7.1 Conclusion
  7.2 Future work
    7.2.1 Security
    7.2.2 Implementing a detection plugin
    7.2.3 Different deployment configurations
    7.2.4 Multiple streams with different layouts
    7.2.5 Implementing the plugin distribution service (Remote Producer/Consumer)
    7.2.6 Using high performance microservices backbone frameworks
    7.2.7 New object detection models and datasets specifically for thermal images

A Firefighting department email conversations
  A.1 General email sent to Firefighting departments
  A.2 Conversation with Firefighting department of Antwerp, Belgium
  A.3 Conversation with Firefighting department of Ostend, Belgium
  A.4 Conversation with Firefighting department of Courtrai, Belgium
  A.5 Conversation with Firefighting department of Ghent, Belgium

B Thermal camera specifications

C Last Post thermal dataset summary
  C.1 24th of March 2018
  C.2 2nd of April 2018
  C.3 3rd of April 2018
  C.4 4th of April 2018
  C.5 5th of April 2018
  C.6 9th of April 2018
  C.7 10th of April 2018
  C.8 11th of April 2018
  C.9 12th of April 2018
List of Figures
2.1 Use case diagram
2.2 Overview of the framework software architecture
2.3 Framework network topology
2.4 Client Interface detailed view
2.5 Stream detailed view
2.6 Stream model
2.7 Plugin model
2.8 Plugin state transition diagram
2.9 Component-connector diagrams of the Producer and Consumer module.
2.10 Producer and Consumer Distribution component-connector diagrams.
2.11 Add plugin sequence diagram
2.12 Link plugins sequence diagram
2.13 Deployment diagrams
3.1 Thermal image and MSX image of a dog.
3.3 Rethink IT: Most used tools and frameworks for microservices results [54]
3.4 Containers compared to virtual machines [66]
4.1 filecam GStreamer pipeline
4.2 local plugin GStreamer pipeline
5.1 Last Post ceremony panorama
5.2 Last Post filming locations
5.3 Main scenes in the Last Post dataset.
5.4 Outliers
6.1 Average training loss per epoch
6.2 Validation metrics per epoch.
6.3 Predictions of the model on images in the validation set.
7.1 GStreamer pipeline for a plugin with a detection model.
List of Tables
2.1 Performance utility tree
2.2 Interoperability utility tree
2.3 Modifiability utility tree
2.4 Usability utility tree
2.5 Security utility tree
2.6 Availability utility tree
2.7 Architecture pattern comparison
6.1 Acceptance test results summary.
6.2 Performance test statistics summary, measured in seconds
6.3 Resource usage of the framework in several conditions.
6.4 Total size of framework components.
6.5 Interoperability test results (S.: Source, L.: Listener)
B.1 Compared cameras, their producing companies and their average retail price.
B.2 Physical specifications.
B.3 Image quality (IR: infrared, SD: standard, FOV: field of view)
B.4 Thermal precision.
B.5 Interfaces.
B.6 Energy consumption.
B.7 Help and support.
B.8 Auxiliary features.
List of Listings
1 Minimal Flask application
2 Vert.x example
3 Spring Boot example
4 docker-compose.yml snippet of the prototype.
5 Mounting the Docker socket on the container
6 Starting a plugin container.
7 Dynamic linking of the decodebin and jpegenc
List of Abbreviations
ACF Aggregated Channel Features
AMQP Advanced Message Queuing Protocol
API Application Programming Interface
AS Availability Scenario
ASR Architecturally Significant Requirement
CLI Command Line Interface
CNN Convolutional Neural Network
CRUD Create Read Update Destroy
DNS Domain Name System
FR Functional Requirement
GPU Graphical Processing Unit
H High
HTTP Hyper Text Transfer Protocol
ICF Integral Channel Features
IoU Intersection over Union
IS Interoperability Scenario
IT Interoperability Tactic
JVM Java Virtual Machine
L Low
LXC Linux Containers
M Medium
mAP mean Average Precision
MJPEG Motion JPEG
MS Modifiability Scenario
MSX Multi Spectral Dynamic Imaging
MT Modifiability Tactic
NFR Non-Functional Requirement
ONNX Open Neural Network Exchange Format
OS Operating System
PS Performance Scenario
PT Performance Tactic
QAR Quality Attribute Requirement
REST Representational State Transfer
RNN Recurrent Neural Network
RPN Region Proposal Network
RTP Real-time Transport Protocol
SS Security Scenario
SSE Sum of Squared Errors
SVM Support Vector Machine
TCP Transmission Control Protocol
UDP User Datagram Protocol
UI User Interface
US Usability Scenario
YOLO You Only Look Once
Chapter 1
Introduction
Throughout history, having an overview of the environment from high viewpoints has held many benefits. Early civilizations used hills to monitor their surroundings and population and to spot possible attackers. The discovery of flight meant that environments could now be viewed from a bird's-eye view, offering even more visibility and revealing much more of the world below. Recently a much smaller type of aircraft was developed: the drone. Ranging from large plane-like to almost insect-like devices and having a wide variety of uses, drones are quickly taking over the sky. Drones would not be as effective without proper cameras providing a detailed view of the world below. With digital video cameras offering superb quality at steadily decreasing cost, almost every scene can be captured in great detail. However, these cameras are limited to the visible light spectrum, which prevents drones from operating in all circumstances, such as night flights. Thermal cameras measure the heat emitted by a scene and can reveal information not visible to the eye, such as hidden persons or animals, pipelines, malfunctioning equipment, etc. The combination of these two technologies certainly holds many exciting opportunities for the future.
1.1 Drones
Drones are flying robots that can fly remotely or autonomously and do not carry a human operator. They can carry a variety of payloads: video cameras, delivery parcels, fluid containers, sensors, lights, but also lethal explosives [1].

Drones are classified into different categories based on varying parameters such as physical characteristics (diameter, weight), aerial movement techniques, application domains, etc. Based on diameter, drones are classified as smart dust (1 mm to 0.25 cm), pico air vehicles (0.25 cm to 2.5 cm), nano air vehicles (2.5 cm to 15 cm), micro air vehicles (15 cm to 1 m), micro unmanned aerial vehicles (1 m to 2 m), and unmanned aerial vehicles (2 m and larger). Depending largely on their diameter, the weight of these devices ranges from less than a gram up to more than 2000 kg. Drones use different flight techniques, such as propulsion engines with wings, rotors in various numbers, flapping wings and even balloons. They are used for all kinds of purposes, ranging from search and rescue missions and environmental protection to delivery and reconnaissance. Hassanalian et al. provide an excellent overview of most types of drones [2].
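The diameter-based taxonomy above can be captured in a small lookup. The sketch below simply restates the classification from the text in Python; the function name and the use of centimetre thresholds are our own choices, not part of any drone standard:

```python
# Size categories by diameter, as listed in the text (upper bounds in cm).
DRONE_CLASSES = [
    (0.25, "smart dust"),                      # 1 mm - 0.25 cm
    (2.5, "pico air vehicle"),                 # 0.25 cm - 2.5 cm
    (15.0, "nano air vehicle"),                # 2.5 cm - 15 cm
    (100.0, "micro air vehicle"),              # 15 cm - 1 m
    (200.0, "micro unmanned aerial vehicle"),  # 1 m - 2 m
]

def classify_by_diameter(diameter_cm: float) -> str:
    """Map a drone diameter in centimetres to its size category."""
    for upper_bound_cm, label in DRONE_CLASSES:
        if diameter_cm <= upper_bound_cm:
            return label
    return "unmanned aerial vehicle"  # 2 m and larger

print(classify_by_diameter(10))   # a 10 cm device is a nano air vehicle
print(classify_by_diameter(350))  # a 3.5 m device is an unmanned aerial vehicle
```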
Due to the increasing interest in commercial drone platforms [3], a variety of payloads was developed specifically for these aerial robots, such as gimbals to mount action video cameras [4], gimbals for delivering parcels [5] and thermal imaging
platforms [6].
1.2 Concepts
1.2.1 Thermal Cameras
Thermal cameras are passive sensors that capture the infrared radiation emitted by all objects with a temperature above absolute zero. In contrast to visible light cameras, thermal cameras do not depend on an external energy source for the visibility and colors of objects or scenes. This makes the captured images independent of illumination and color. Furthermore, images can be captured in the absence of visible light [7]. Thermal camera technology was originally developed for military night vision purposes, and the devices were very expensive. Later, the technology was commercialized and has developed quickly over the last few decades, resulting in better and cheaper cameras [7]. This gave a broader public access to the technology, which is now used in a wide range of applications, such as building inspection, gas detection, industrial appliances, medical science, agriculture, fire detection, surveillance, etc. [7] Thermal cameras are now also being mounted on drones to give an aerial thermal overview.
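The underlying physics can be made concrete with the Stefan-Boltzmann law, which the dissertation does not spell out but which governs the radiation a thermal camera measures: emitted power grows with the fourth power of absolute temperature. A small illustrative Python sketch; the emissivity values used here are rough textbook assumptions, not measurements from this work:

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W / (m^2 * K^4)

def radiated_power(temp_kelvin: float, emissivity: float = 1.0) -> float:
    """Power radiated per square metre of surface (W/m^2)."""
    return emissivity * SIGMA * temp_kelvin ** 4

# A person at skin temperature (~310 K) radiates more per unit area than
# cooler surroundings (~280 K); this contrast is what a thermal camera
# turns into an image, with or without visible light.
person = radiated_power(310.0, emissivity=0.98)      # ~513 W/m^2
background = radiated_power(280.0, emissivity=0.95)  # ~331 W/m^2
```

Because the measured signal depends only on temperature and emissivity, the captured image is independent of illumination, which is exactly the property exploited for night flights.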
1.2.2 Aerial thermal imaging
Aerial thermal imaging is defined as the creation of thermal images using a flying device. This dissertation focuses on the usage
of drones for aerial thermal imaging. There are many applications for aerial thermal imaging. Some examples are: geography
[8, 9], agriculture [10, 11], search and rescue operations [12], wildlife monitoring [13], forest monitoring [14, 15], disaster response
[16], equipment and building maintenance [17–20], etc. In the past few years, several industry players have developed thermal
cameras specifically aimed at these drone applications. Examples are FLIR [6], Workswell [21] and TEAX Technology [22].
1.3 Problem statement
1.3.1 Industry adoption
The implementation of thermal cameras on drone platforms faces some issues hindering wide adoption by the industry. Several vendors offer thermal camera products, some specifically designed for drone platforms, that often implement different image formats, color schemes and interfaces (e.g. [23–25]). This leads to issues when users want to modify their applications by changing the camera that is used, because the application must implement new software to interact with the camera, or when the camera is no longer supported by the vendor, leaving the application with outdated hardware and software. The result is a problem called vendor lock-in that makes customers dependent on a certain vendor, as they cannot switch products without incurring substantial costs, a problem already very tangible for cloud-based applications today [26].
Applications across various fields often have different functional and non-functional requirements. Some applications have hard real-time deadlines (such as firefighting, search and rescue, security, etc.) that must be respected, while other applications require
highly accurate predictions (e.g. person detection, agriculture, etc.). A single application domain can even have many different
use cases.
Several firefighting departments in Belgium were contacted to get an overview of their needs for a drone thermal imaging application. It quickly became clear that they face various detection problems, such as finding missing persons, locating hot explosives, measuring temperatures in silos and detecting invisible methane fires. Their equipment also wears down more quickly due to usage in harsh environments, such as the close proximity of fires. To deliver value for them, a drone thermal application needs to be able to exchange functionality and hardware easily and must meet strict performance constraints. The email conversations can be read in Appendix A.
Other drone thermal imaging applications are mostly only used in the niche domain for which they were developed, because they are not designed for flexibility [27]. These proprietary applications have several disadvantages: development and support potentially carry a large cost, vendor lock-in can occur when products are no longer supported, security issues could arise and customization is difficult [28, 29]. Applications could benefit from a backbone framework addressing this modifiability/interoperability issue, aiding in developing end-to-end solutions that connect thermal cameras to various analysis/detection modules for various use cases.
1.3.2 Crowd monitoring
Festivals and other open-air events are popular gatherings that attract many people. For every event organizer it is important to ensure safety and avoid incidents. Large groups of people, so-called mobs, can create potentially dangerous situations through bottlenecks, blocked escape routes, etc. Therefore, the ability to monitor crowds and predict their behavior is very important to avoid such scenarios. Data can be obtained by evaluating video footage from comparable past events or by real-time video monitoring of current events [30]. By analyzing this footage, potentially dangerous situations can be avoided by acting on the mob formation, and safety regulations can be improved to help plan future events. Vision-based approaches face several limitations: mounted cameras cannot capture elements outside of their field of view, cannot see in some conditions (for example during nighttime) and it is difficult to infer information from the raw footage [31].

Thermal cameras could help crowd monitoring because they can operate in any condition. Precise and detailed object recognition for the images produced by these cameras is crucial to extract information correctly. In this context, classifying images is not sufficient; the objects contained within the images must be localized. This problem is known as object detection [32]. There are several challenges for object detection in thermal images: the image quality is very low compared to visible light images, color and texture information is lacking, and temperature measurements are relative. This makes extracting discriminative information from these images difficult [33]. Most efforts towards object detection in thermal images have gone towards human detection. Most of the proposed algorithms focus on feature extraction using the Aggregated Channel Features technique and boosting algorithms for learning [33–35]. Novel approaches make use of so-called deep learning with neural networks, which achieve very good results given enough data [36].
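Object detection quality is commonly scored with the Intersection over Union (IoU) of predicted and ground-truth bounding boxes, a measure that also appears in this dissertation's abbreviation list. A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (empty if boxes are disjoint).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes: intersection 1, union 7, IoU 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5; metrics like mean Average Precision (mAP) build on this.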
1.3.3 Goal
The goal of the dissertation is to explore the requirements of the possible backbone framework suggested in Section 1.3.1 and its
potential software architecture. The architecture is evaluated by building a proof of concept implementation of the framework
and evaluating it against the proposed requirements. To verify its use in developing drone thermal imaging applications, the specific mob detection use case is investigated.
1.3.4 Related work
The Irish start-up DroneSAR [37] developed a search-and-rescue (SAR) drone platform allowing users to stream live images and video from a drone as it conducts a search for missing persons. The platform works with any camera, visual or thermal, but focuses on drones from the vendor DJI, DroneSAR's industry partner. Amazon introduced the Amazon Kinesis Video Streams platform in January 2018 as a new service for the Amazon Web Services (AWS) cloud platform. It allows users to stream live video from devices to the AWS cloud and build applications for real-time video processing [38]. The VIPER project by EAVISE, KU Leuven, researched how thermal and visual video images could be used for real-time detection of persons using object detection algorithms such as deep learning [36]. The framework presented in this work combines elements from all three of these examples.
1.4 Outline
The remainder of this dissertation is organized as follows. Chapter 2 presents the requirements for the framework and the
software architecture. Chapter 3 explores several state of the art technologies that can serve as backbone technologies for
the framework. To test the viability of the software architecture, a prototype is implemented. Chapter 4 presents the different
aspects of this prototype. Chapter 5 describes the experiment that is conducted to research the detection of mobs in thermal
images. The results of both the framework and the detection experiment are presented and evaluated in Chapter 6. Finally, the conclusion and future research efforts are presented in Chapter 7.
Chapter 2
System Design
Finding out what users actually expect from a software system and what makes it valuable for them is of key importance for the
success of that system. This chapter first explores the functional and non-functional requirements of the hypothetical framework suggested in Chapter 1, to find out what makes building the framework worthwhile. Well-known architectural patterns support certain software requirements very well and can be used to build the framework's software architecture. The framework software architecture combines some of these patterns and is presented in several documents.
2.1 Requirements analysis
Requirements are the stated life-cycle customer needs and objectives for the system, and they relate to how well the system
will work in its intended environment. They are those aspects of the framework that will provide value to the users.
2.1.1 Functional requirements
Functional requirements (FR) describe the necessary task, action or activity that must be accomplished by the system, often
captured in use cases and/or user stories [39, 40]. Use cases provide a summary of the features described in the user stories.
Several external people and/or systems, defined as actors, interact with the framework to achieve a certain goal [40]. Three
actors are identified for the framework: an end-user who uses the framework to build an image processing application for a
specific use case such as the ones described in Section 1.2.2; a camera developer who creates support software for a specific
thermal camera so that the end-user can buy and use their product; and an analysis software developer who creates analysis
software for a specific use case (tracking objects, detecting objects, etc.) so that the end-user can use their software to
build his¹ application. The camera and analysis software developers are generalized to an actor called plugin developer, who
develops plugins to extend the functionality of the framework. These plugins are the building blocks with which the end-user
can build image processing applications.
¹ To avoid unnecessary gender-specific pronoun clutter, the male pronoun is used by default.

The general user scenario for the framework proceeds as follows. An end-user wants to build an image processing application,
e.g. to detect fires in a landscape using a drone. He has a thermal camera for this and has read about hot-spot detection in
video. The user looks for a plugin for the framework that can read video from his thermal camera and for a plugin that does
the hot-spot detection. If the user finds these plugins, he can add them to the framework and use them for the application he
is building. He connects both plugins with the framework in a specific order to finish his application. For this simple example,
the thermal camera plugin is connected to the hot-spot detection plugin so that video coming from the thermal camera is
transmitted to the detection plugin to find the fires in the landscape. The plugins in the application and the specific order in
which they are connected are defined as a stream. This stream should be easily modifiable if additional or other functionalities
are required. Suppose that the thermal camera delivers very high quality video and the detector cannot work with this, as it
can only operate on low quality images. The end-user then searches for a plugin that scales the high quality video down to a
quality accepted by the detector. This plugin is placed in between the thermal camera and the detector, and the application
works again. By continuously adding plugins to the framework, the number of possible applications that can be built with the
framework increases, making the framework usable for more aerial thermal imaging use cases.
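The stream concept described above can be sketched in code. The sketch below is purely illustrative: the class and method names (Plugin, Stream, insert_between, etc.) are assumptions for the sake of the example, not the framework's actual API. It shows the scenario from the text: a thermal camera linked to a hot-spot detector, with a downscaler plugin inserted between them afterwards.

```python
# Illustrative sketch of a stream: plugins linked in a specific order,
# with frames flowing through the chain. All names are hypothetical.

class Plugin:
    """Base class: a plugin processes a frame and passes it on."""
    def process(self, frame):
        return frame

class ThermalCamera(Plugin):
    def process(self, frame):
        # A real Producer plugin would read from camera hardware; here we
        # simulate a high-resolution thermal frame.
        return {"resolution": (1280, 1024), "pixels": frame}

class Downscaler(Plugin):
    def process(self, frame):
        # Reduce the resolution so the detector can handle the input.
        frame["resolution"] = (320, 256)
        return frame

class HotSpotDetector(Plugin):
    def process(self, frame):
        # The detector only works on low-quality input.
        frame["hotspots_detected"] = frame["resolution"] == (320, 256)
        return frame

class Stream:
    """A stream is the ordered list of linked plugins."""
    def __init__(self, plugins):
        self.plugins = list(plugins)

    def insert_between(self, index, plugin):
        # Modifiability: place a new plugin between two existing ones.
        self.plugins.insert(index, plugin)

    def push(self, frame):
        for plugin in self.plugins:
            frame = plugin.process(frame)
        return frame

# Camera linked directly to the detector would fail on high-quality video,
# so the user inserts a downscaler between the two plugins:
stream = Stream([ThermalCamera(), HotSpotDetector()])
stream.insert_between(1, Downscaler())
result = stream.push(frame=None)
```

Because the stream is just an ordered list, swapping hardware or algorithms amounts to replacing one element, which is exactly the modifiability the text argues for.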
Instead of developing the application from scratch, users can use already implemented plugins to build applications in
an ad hoc fashion. Because of this, the development time for such applications can be reduced and users can easily switch
hardware and/or algorithms. The FRs are summarized in a use case diagram that connects each actor with their respective
requirements and the relationships among them [40], depicted in Figure 2.1. Trivial functionalities such as launching and
shutting down the framework are omitted. The red use cases extend the functionality of the framework, the blue use cases
build streams, and the white use cases modify the media processing of a stream. Some use cases depend on others: the blue
and white use cases work with plugins of the framework, so their prerequisite use case is “Add plugin”, as a plugin must be
part of the framework before a user can use it; the “(Un)Link plugins” and “Stop/Pause/Play stream” use cases depend on
“Add plugins to stream”, as a stream must contain plugins before they can be manipulated.
2.1.2 Non-functional requirements
A non-functional requirement (NFR) specifies how the framework is supposed to be, or in what manner it should execute its
functionality [41]. These qualifications typically cover business and system quality requirements. A distinction is made between
quality attribute requirements (QAR) and constraints. QARs are qualifications of the FRs or of the overall product, e.g. how
fast a certain function must be executed or how resilient it must be to erroneous input. They are closely related to business
requirements, which are specifications that, once delivered, provide value to the actors [40]. The QARs are captured in a utility
tree [40] that has a root node representing the system. This root node is elaborated by listing the major QARs that the system
is required to exhibit, and each QAR is subdivided into more specific QARs. To make the specific requirements unambiguous and
testable, a scenario for the system or a specific function is written, and the scenarios are evaluated against business value
and architectural impact [40]. A QAR can have High (H), Medium (M) or Low (L) business value and architectural impact
respectively. The business value is defined as the value for the end user if the QAR is enabled: High designates a must-have
requirement; Medium a requirement which is important but whose absence would not lead to project failure; Low a nice-to-have
QAR that is not worth much effort. Architectural impact defines how much the architecture must be designed
towards the QAR to enable it: High means that meeting the QAR will profoundly affect the architecture, Medium that it will
somewhat affect the architecture, and Low that it will have little effect on the architecture. The following QARs are
discussed: performance, interoperability, modifiability, usability, security and availability.

Figure 2.1: Use case diagram.
Performance
Performance indicates the framework's ability to meet timing requirements. It characterizes the events that can occur and the
framework's time-based response to those events. Latency is defined as the time between the arrival of the stimulus and the
system's response to it [40]. The system latency is the latency between the initialization of an action sequence and the first
change to the system noticeable by the user. Streaming latency is defined as the time between the arrival of a video frame and
the arrival of the next video frame. The jitter of the response is the allowable variation in latency. Scalability is the number
of users that can use the framework at the same time. The utility tree is presented in Table 2.1. The software industry has
not defined a quantified 'good latency' for end-users, but a 4-second latency rule is often used as a rule of thumb [42]. The
average response time for general framework commands should then be less than 2 seconds, with a standard deviation of 1
second, ensuring most execution times respect the 4-second bound. As stated in Chapter 1, some use cases require real-time
video streaming, such as fire fighting. The notion of low-latency real-time video loosely means that video should be streamed
almost simultaneously with capture: if a camera is filming and a human user does not notice a latency between the video of the
camera and the real world, the video stream is considered real-time. Real-time is thus a matter of human time perception, and
for visual inputs this bound is as low as 13 milliseconds. Anything above 13 milliseconds becomes noticeable; anything above
100 milliseconds hinders human performance [43, 44]. However, the framework focuses on the use of thermal cameras, most of
which don't produce frames faster than 8 frames per second, or 125 milliseconds per frame (see Section 3.1). More expensive cameras
can shoot at 25 frames per second, corresponding to a latency of 40 milliseconds; this bound is selected for the streaming
latency, with a standard deviation of 20 milliseconds, remaining below the frame period of less expensive cameras. The number
of users that can use the framework at the same time is assumed to be low, as current aerial thermal imaging applications are
operated by only one or a few users. The assumption is that a maximum of five users can use the framework at the same time.
All of these requirements are quantified as relatively 'good' values; these bounds should be evaluated for user satisfaction
by having users work with a prototype of the framework in practice.
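The latency bounds above follow directly from the camera frame rates; the small sketch below just makes the arithmetic explicit (the function name is illustrative).

```python
def frame_period_ms(fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps

# Low-cost thermal cameras: ~8 fps -> 125 ms between frames.
low_cost = frame_period_ms(8)
# Higher-end cameras: 25 fps -> 40 ms, the chosen streaming-latency bound.
high_end = frame_period_ms(25)
# Even with the full 20 ms jitter allowance, 40 + 20 = 60 ms stays
# below the 125 ms frame period of the low-cost cameras.
worst_case = high_end + 20
```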
Attribute refinement  Id    Quality attribute scenario
Latency               PS-1  The average execution time of all framework commands does not exceed 2 seconds. (H, M)
                      PS-2  A playing stream should have an upper limit of 40 ms streaming latency. (H, H)
Jitter                PS-3  The average standard deviation of the execution time of all framework commands should not exceed 1 second under normal operation. (H, M)
                      PS-4  The average standard deviation in streaming latency should not exceed 20 ms under normal operation. (H, H)
Scalability           PS-5  The system should be usable by five users at the same time. (M, M)

Table 2.1: Performance utility tree
Interoperability
Interoperability is the degree to which two or more independently developed systems can usefully exchange meaningful infor-
mation via interfaces in a particular context [40]. The framework will interoperate with cameras and analysis modules via the
framework plugins. Henceforth the camera and analysis plugins will be referred to as a Producer plugin and a Consumer plugin.
A Producer plugin is thus a plugin that represents a camera producing video, and a Consumer plugin one that represents a
module processing or consuming video. The framework interacts with the Producer and Consumer plugins by exchanging requests
to link them together, control their media processing, etc. The more correct exchanges there are between the two, the better
the user can use the plugin for building applications with the framework. This QAR is quantified by the ratio of requests
made by the framework that were correctly interpreted by the plugin (successful requests) to the total number of requests
during a runtime of the framework [40]. Intuitively one argues that the framework must
achieve perfect interoperability, i.e. an exchange success rate of 100%. In reality, however, it can never be guaranteed that
all exchanges will be correct. It is therefore better to aim for a good interoperability measure and prepare for failed
exchanges instead of naively assuming the framework will be perfect. An exchange success rate of 99.99% is selected; the
motivation for this bound is as follows. A plugin is assumed to be correct up to the first mistake, after which the plugin is
considered faulty and the fault needs to be identified and prevented from recurring. An exchange success rate of 99.99% means
that if 10000 plugins are installed and used by the framework, only one will fail during uptime. For one plugin during
framework uptime, the mean time between failures is then 10000 exchanges. It is suspected that this
number of exchanges is very high for normal framework use. Because the possibility of faulty exchanges is acknowledged,
the framework will need to implement a fallback mechanism to compensate. The utility tree is presented in Table 2.2.
Attribute refinement        Id    Quality attribute scenario
Syntactic interoperability  IS-1  The framework exchanges requests with a Producer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)
                            IS-2  The framework exchanges requests with a Consumer plugin (known at runtime) with a success ratio of 99.99%. Incorrect requests are undone by the framework and logged. (H, H)

Table 2.2: Interoperability utility tree
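The fallback mechanism implied by IS-1 and IS-2 (undo and log incorrect requests, track the exchange success ratio) could be sketched as follows. This is a hypothetical illustration; PluginProxy, exchange and the demo FlakyPlugin are invented names, not part of the framework.

```python
import logging

# Hypothetical sketch: a failed exchange with a plugin is undone and
# logged, and the framework tracks the observed exchange success ratio.

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("framework.plugins")

class PluginProxy:
    def __init__(self, plugin):
        self.plugin = plugin
        self.total = 0
        self.successful = 0

    def exchange(self, request, undo):
        """Send a request to the plugin; on failure, run the undo action."""
        self.total += 1
        try:
            result = self.plugin.handle(request)
            self.successful += 1
            return result
        except Exception as exc:
            undo()  # roll the framework state back
            log.warning("request %r failed: %s", request, exc)
            return None

    def success_ratio(self):
        return self.successful / self.total if self.total else 1.0

class FlakyPlugin:
    """Demo plugin whose third exchange fails."""
    def __init__(self):
        self.calls = 0

    def handle(self, request):
        self.calls += 1
        if self.calls == 3:
            raise RuntimeError("bad exchange")
        return "ok"

proxy = PluginProxy(FlakyPlugin())
rollbacks = []
for i in range(4):
    proxy.exchange({"id": i}, undo=lambda: rollbacks.append(i))
ratio = proxy.success_ratio()  # 3 of 4 exchanges succeeded
```

In a real deployment the ratio would be monitored against the 99.99% bound rather than computed over four requests.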
Modifiability
Modifiability is the cost and risk of changing functionality of the system [40]. One of the most important values of the
framework is modifiability of the supported thermal cameras and analysis modules. The framework needs to be extendable with
new functionalities by enabling developers to add their support software in the form of a plugin. End-users should be able to
modify the components that they use for their image processing applications easily and quickly, allowing them to interchange
hardware and software and quickly set up new applications. Modifiability is considered in two environments: runtime, defined
as periods during which the system is up and running, and downtime, defined as periods during which the system is not active.
The utility tree is presented in Table 2.3.
To enable users to choose the extensions they need, the framework will need a distribution service that contains all plugins
available for the framework, from which a user can select and install plugins for their local version of the framework. Adding
new plugins to the distribution service should not affect the versions of the framework installed by users. When a user adds a
plugin from the distribution service to his version of the framework, the framework should, for user comfort, reload no more
than once before making the plugin usable. Deployability is defined as the different device configurations in which the
framework can be deployed. If the framework can be deployed in different fashions, this increases its value for the end-user.
Consider a fire-fighting use case in which a forest fire is monitored on site: computationally powerful devices might not be
available there, so moving some media-processing plugins to a remote server or cloud could still allow usage of the framework.
Alternatively, the device processing the media may itself be remote, for example a drone on security patrol; in this case
access via a remote device such as a smartphone is desirable. This leads to the deployment configurations described in the
utility tree.
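The relationship between the distribution service and a local framework installation, including the single-reload requirement, can be sketched as below. The classes and methods (DistributionService, LocalFramework, publish, install) are illustrative assumptions, not the actual design.

```python
# Hypothetical sketch: publishing to the distribution service never touches
# local installations, and installing a plugin locally triggers exactly
# one reload (cf. MS-5/MS-6).

class DistributionService:
    def __init__(self):
        self.available = {}

    def publish(self, name, plugin):
        # Only the catalogue changes; no local framework is affected.
        self.available[name] = plugin

class LocalFramework:
    def __init__(self, distribution):
        self.distribution = distribution
        self.installed = {}
        self.reload_count = 0

    def install(self, name):
        self.installed[name] = self.distribution.available[name]
        self.reload_count += 1  # a single reload per installed plugin

dist = DistributionService()
dist.publish("thermal-producer", object())  # hypothetical plugin name
local = LocalFramework(dist)
local.install("thermal-producer")
```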
Usability
Usability indicates how easy it is for the user to accomplish a desired task and the kind of user support the system provides.
Learnability indicates how easy it is for a user to gain knowledge on how to use the framework. Errors are the number of errors
Attribute refinement     Id    Quality attribute scenario
Run time modifiability   MS-1  Support for a new Producer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                         MS-2  Support for a new Consumer plugin should be added to the distribution service within one day, without the framework having to restart. (H, H)
                         MS-3  End-users should be able to extend their framework with new functionalities by installing new Consumer and Producer plugins. (H, H)
                         MS-4  End-users should be able to modify the plugins used to build their stream. (H, H)
Down time modifiability  MS-5  New Producer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
                         MS-6  New Consumer plugins can be installed to the local framework at runtime; the framework can only reload once before the plugin is usable. (H, H)
Deployability            MS-7  The system should be deployable on a combination of a smartphone and cloud/remote server environment. (H, H)
                         MS-8  The system should be deployable on a personal computer or laptop. (H, H)
                         MS-9  The system should be deployable on a smartphone, laptop and cloud environment. (H, H)

Table 2.3: Modifiability utility tree
a user can make when trying to execute certain functions [40]. The utility tree is presented in Table 2.4.
Security
Security is a measure of the system's ability to protect data and information from unauthorized access while still providing
access to users and systems that are authorized. An action taken against the system to cause it harm is called an attack.
Security has three main characteristics. Confidentiality is the property that data or services are protected from unauthorized
access. Integrity is the property that data or services are protected from unauthorized manipulation. Availability is the
property of the system maintaining its functionality during an attack. Authentication verifies the identities of the parties
to an interaction, checks whether they are truly who they claim to be, and grants or denies access accordingly [40]. Security
is important for the framework if it is deployed on multiple devices that communicate over a public network. The utility tree
is presented in Table 2.5.
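The confidentiality and integrity scenarios (SS-1, SS-2) boil down to an ownership check before any stream access. A minimal sketch, with invented names (StreamRegistry, authorize) standing in for whatever mechanism the framework would actually use:

```python
# Hypothetical sketch of SS-1/SS-2: a stream is owned by the user who
# created it; access or manipulation by any other entity is refused.

class StreamRegistry:
    def __init__(self):
        self.owners = {}

    def create(self, stream_id, user):
        self.owners[stream_id] = user

    def authorize(self, stream_id, user):
        # Both reading (confidentiality) and manipulating (integrity)
        # a stream require passing this check.
        return self.owners.get(stream_id) == user

registry = StreamRegistry()
registry.create("fire-watch", user="alice")   # example stream and users
alice_ok = registry.authorize("fire-watch", "alice")
bob_ok = registry.authorize("fire-watch", "bob")
```

In a multi-device deployment the user identity itself would come from the authentication step (SS-4), not from a plain string as in this sketch.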
Availability
Availability in a general context (not only security) refers to how available the software is to carry out its functionality.
Downtime is a measure of the time that the system is unavailable to carry out its functions. Availability is specified for the
part of the framework that distributes the plugins. The utility tree is presented in Table 2.6.
Attribute refinement  Id    Quality attribute scenario
Learnability          US-1  A user should be able to learn how to build an image processing application in at most one hour. (H, L)
                      US-2  An experienced developer should be able to start developing a Consumer plugin for the system within one day. (H, L)
                      US-3  An experienced developer should be able to start developing a Producer plugin for the system within one day. (H, L)
Errors                US-4  A user should not make more than 3 errors to build an image processing application. (H, L)

Table 2.4: Usability utility tree
Attribute refinement  Id    Quality attribute scenario
Confidentiality       SS-1  Streams created by a user can only be accessed by that user and not by any other entity. (H, L)
Integrity             SS-2  Streams can't be manipulated without authorization by the user that made the streams. (H, L)
Availability          SS-3  During an attack, the core functionality is still available to the user. (H, M)
Authentication        SS-4  Users should authenticate with the system to perform functions. (H, L)
                      SS-5  Developers should authenticate their plugins before adding them to the framework. (H, L)

Table 2.5: Security utility tree
Architecturally significant requirements
Architecturally Significant Requirements (ASR) are the requirements that are the most important to realize according to
business value and that have the most impact on the architecture. From the utility trees and the measures of the quality
attribute scenarios, the QARs modifiability, interoperability and performance are identified as ASRs.
2.2 Patterns and tactics
An architectural pattern is a package of design decisions that is found repeatedly in practice, has known properties that
permit reuse, and describes a class of architectures. Architectural tactics are simpler than patterns: they typically use just
a single structure or computational mechanism and are meant to address a single architectural force. Tactics are the "building
blocks" of design, and an architectural pattern typically comprises one or more tactics [40]. Based on the ASRs, several
tactics are listed in Table 2.7 that are used for the base pattern selection. The explored patterns are: layers, event-driven architecture,
microkernel and microservices.
2.2.1 Layers
The layered pattern divides the software into units called layers, each of which performs a specific role within the
application. Each layer is allowed to use only the layer directly beneath it, via its interface. Changes in one layer are
isolated as long as the interfaces don't change, enabling MT-1, MT-2 and MT-5 [40]. While changes can be isolated to
individual layers, they remain difficult due to the monolithic nature of most implementations of this pattern. Layers also
incur a performance penalty due to the "architecture sinkhole" phenomenon, in which requests simply propagate through layers
for the sake of the layering [45].
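The sinkhole phenomenon is easiest to see in code. In the illustrative three-layer sketch below (all names invented), the business layer adds no logic of its own; the request merely propagates through it on its way to the persistence layer:

```python
# Illustrative layered pattern: each layer calls only the layer directly
# beneath it. The business layer here is a "sinkhole" -- it adds latency
# and code without adding value.

class PersistenceLayer:
    def load(self, key):
        return {"key": key, "value": 42}  # stand-in for a data store

class BusinessLayer:
    def __init__(self, persistence):
        self.persistence = persistence

    def get(self, key):
        # Sinkhole: no business logic, the call just passes downward.
        return self.persistence.load(key)

class PresentationLayer:
    def __init__(self, business):
        self.business = business

    def show(self, key):
        record = self.business.get(key)
        return f"{record['key']} = {record['value']}"

app = PresentationLayer(BusinessLayer(PersistenceLayer()))
output = app.show("sensor-1")
```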
2.2.2 Event-driven architecture
This pattern consists of several event publishers that create events and event subscribers that process these events. The
publishers and subscribers are decoupled by an event channel: publishers publish events to the channel, which forwards them
to the subscribers. The subscribers should have a single purpose and execute asynchronously. Since the publishers and
subscribers are single-purpose and completely decoupled from other components via the event channel, changes are isolated to
one or a few components, enabling MT-1, MT-2, MT-4, MT-5 and MT-7. If the event ch