Fusion of Federated Learning and Industrial Internet of ...

24
1 Fusion of Federated Learning and Industrial Internet of Things: A Survey Parimala M, Swarna Priya R M, Quoc-Viet Pham, Kapal Dev, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, and Thien Huynh-The Abstract—Industrial Internet of Things (IIoT) lays a new paradigm for the concept of Industry 4.0 and paves an insight for new industrial era. Nowadays smart machines and smart factories use machine learning/deep learning based models for incurring intelligence. However, storing and communicating the data to the cloud and end device leads to issues in preserving privacy. In order to address this issue, federated learning (FL) technology is implemented in IIoT by the researchers nowa- days to provide safe, accurate, robust and unbiased models. Integrating FL in IIoT ensures that no local sensitive data is exchanged, as the distribution of learning models over the edge devices has become more common with FL. Therefore, only the encrypted notifications and parameters are communicated to the central server. In this paper, we provide a thorough overview on integrating FL with IIoT in terms of privacy, resource and data management. The survey starts by articulating IIoT characteristics and fundamentals of distributive and FL. The motivation behind integrating IIoT and FL for achieving data privacy preservation and on-device learning are summarized. Then we discuss the potential of using machine learning, deep learning and blockchain techniques for FL in secure IIoT. Further we analyze and summarize the ways to handle the heterogeneous and huge data. Comprehensive background on data and resource management are then presented, followed by applications of IIoT with FL in healthcare and automobile industry. Finally, we shed light on challenges, some possible solutions and potential directions for future research. Index Terms—Data Storage, IIoT, Federated Learning, Data Privacy, Data sharing, Resource Management. I. I NTRODUCTION With an unprecedented increase in the number of Internet of things (IoT) devices and emerging applications, a large amount of traffic is created every day. Such an increase poses a great burden on the Internet network and also demands significant investments for the infrastructure upgrade. However, thanks to the development of big data analytics and artificial intelligence (AI) techniques such as deep learning (DL) and machine learning (ML), the data collected can be effectively exploited for many purposes. From the communications perspective, Parimala M, Swarna Priya R. M, Praveen Kumar Reddy Maddikunta and Thippa Reddy Gadekallu are with the School of Information Tech- nology and Engineering, Vellore Institute of Technology, Vellore 632014, India (e-mail: [email protected], [email protected], praveenku- [email protected], [email protected]). Quoc-Viet Pham is with the Research Institute of Computer, Information and Communication, Pusan National University, Busan 46241, Republic of Korea (e-mail: [email protected]). Kapal Dev with CONNECT Centre, Trinity College Dublin, D02 PN40, Ireland, (e-mail:[email protected]). Thien Huynh-The is with the ICT Convergence Research Center, Kumoh National Institute of Technology, Gyeongsangbuk-do 39177, Republic of Korea (e-mail: [email protected]). TABLE I ACRONYMS IoT Internet of things IIoT Industrial Internet of Things DL Deep Learning ML Machine Learning FL Federated Learning DP Differential Privacy FML Federated Machine Learning FDL Federated Deep Learning AI Artificial Intelligence DRL Deep Reinforcement Learning DML Distributed Machine Learning FTM Federated Tensor Mining IoE Internet of Everything RFID Radio-frequency data SDN Software Defined Network BF Bloom Filter FFBF Fuzzy Folded BF IDMS Industrial Data Management System SOA Service Oriented Architecture IWNs Industrial Wireless Networks CCPSA Cyber Physical System Architecture CPS Cyber Physical Systems MTs Mobile Terminals MPC Multi-party Computation PEFL Privacy Enhanced Federated Learning PSO Particle Swarm Optimization BS Base station MBS Macro Base Station SBS Small Cell Base Station IFL Industrial Federated Learning DTWN Digital Twin Wireless Networks DFRL Deep Federated Reinforcement Learning DFQL Deep Federated Q-Learning MAQL Multi-Agent Deep Q-learning UAVs Unmanned Aerial Vehicles GFC Ground Fusion Centre LSTM Long-Short Term Memory FIL Federated Imitation Learning the last few years has witnessed the emergence of AI ap- plications in various fields. For example, ML is employed to investigate an efficient antenna selection in multi-antenna wireless systems [1], DL is used to handle the computation offloading problem in IoT systems with edge computing [2], and deep reinforcement learning (DRL) is used to optimize resource allocation problems at the network edge such as traffic classification, edge caching, network security, and data offloading [3]. However, the conventional AI models typically require central processing of data collected from all the users in the network, that is, users should upload their own data to a central server for training the learning model. However, a critical concern with the central learning is data privacy, i.e., some users want to keep track of their local data and arXiv:2101.00798v1 [cs.NI] 4 Jan 2021

Transcript of Fusion of Federated Learning and Industrial Internet of ...

Page 1: Fusion of Federated Learning and Industrial Internet of ...

1

Fusion of Federated Learning andIndustrial Internet of Things: A Survey

Parimala M, Swarna Priya R M, Quoc-Viet Pham, Kapal Dev,Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, and Thien Huynh-The

Abstract—Industrial Internet of Things (IIoT) lays a newparadigm for the concept of Industry 4.0 and paves an insightfor new industrial era. Nowadays smart machines and smartfactories use machine learning/deep learning based models forincurring intelligence. However, storing and communicating thedata to the cloud and end device leads to issues in preservingprivacy. In order to address this issue, federated learning (FL)technology is implemented in IIoT by the researchers nowa-days to provide safe, accurate, robust and unbiased models.Integrating FL in IIoT ensures that no local sensitive data isexchanged, as the distribution of learning models over the edgedevices has become more common with FL. Therefore, only theencrypted notifications and parameters are communicated to thecentral server. In this paper, we provide a thorough overviewon integrating FL with IIoT in terms of privacy, resourceand data management. The survey starts by articulating IIoTcharacteristics and fundamentals of distributive and FL. Themotivation behind integrating IIoT and FL for achieving dataprivacy preservation and on-device learning are summarized.Then we discuss the potential of using machine learning, deeplearning and blockchain techniques for FL in secure IIoT.Further we analyze and summarize the ways to handle theheterogeneous and huge data. Comprehensive background ondata and resource management are then presented, followedby applications of IIoT with FL in healthcare and automobileindustry. Finally, we shed light on challenges, some possiblesolutions and potential directions for future research.

Index Terms—Data Storage, IIoT, Federated Learning, DataPrivacy, Data sharing, Resource Management.

I. INTRODUCTION

With an unprecedented increase in the number of Internet ofthings (IoT) devices and emerging applications, a large amountof traffic is created every day. Such an increase poses a greatburden on the Internet network and also demands significantinvestments for the infrastructure upgrade. However, thanks tothe development of big data analytics and artificial intelligence(AI) techniques such as deep learning (DL) and machinelearning (ML), the data collected can be effectively exploitedfor many purposes. From the communications perspective,

Parimala M, Swarna Priya R. M, Praveen Kumar Reddy Maddikuntaand Thippa Reddy Gadekallu are with the School of Information Tech-nology and Engineering, Vellore Institute of Technology, Vellore 632014,India (e-mail: [email protected], [email protected], [email protected], [email protected]).

Quoc-Viet Pham is with the Research Institute of Computer, Informationand Communication, Pusan National University, Busan 46241, Republic ofKorea (e-mail: [email protected]).

Kapal Dev with CONNECT Centre, Trinity College Dublin, D02 PN40,Ireland, (e-mail:[email protected]).

Thien Huynh-The is with the ICT Convergence Research Center, KumohNational Institute of Technology, Gyeongsangbuk-do 39177, Republic ofKorea (e-mail: [email protected]).

TABLE IACRONYMS

IoT Internet of thingsIIoT Industrial Internet of ThingsDL Deep LearningML Machine LearningFL Federated LearningDP Differential PrivacyFML Federated Machine LearningFDL Federated Deep LearningAI Artificial IntelligenceDRL Deep Reinforcement LearningDML Distributed Machine LearningFTM Federated Tensor MiningIoE Internet of EverythingRFID Radio-frequency dataSDN Software Defined NetworkBF Bloom FilterFFBF Fuzzy Folded BFIDMS Industrial Data Management SystemSOA Service Oriented ArchitectureIWNs Industrial Wireless NetworksCCPSA Cyber Physical System ArchitectureCPS Cyber Physical SystemsMTs Mobile TerminalsMPC Multi-party ComputationPEFL Privacy Enhanced Federated LearningPSO Particle Swarm OptimizationBS Base stationMBS Macro Base StationSBS Small Cell Base StationIFL Industrial Federated LearningDTWN Digital Twin Wireless NetworksDFRL Deep Federated Reinforcement LearningDFQL Deep Federated Q-LearningMAQL Multi-Agent Deep Q-learningUAVs Unmanned Aerial VehiclesGFC Ground Fusion CentreLSTM Long-Short Term MemoryFIL Federated Imitation Learning

the last few years has witnessed the emergence of AI ap-plications in various fields. For example, ML is employedto investigate an efficient antenna selection in multi-antennawireless systems [1], DL is used to handle the computationoffloading problem in IoT systems with edge computing [2],and deep reinforcement learning (DRL) is used to optimizeresource allocation problems at the network edge such astraffic classification, edge caching, network security, and dataoffloading [3]. However, the conventional AI models typicallyrequire central processing of data collected from all the usersin the network, that is, users should upload their own datato a central server for training the learning model. However,a critical concern with the central learning is data privacy,i.e., some users want to keep track of their local data and

arX

iv:2

101.

0079

8v1

[cs

.NI]

4 J

an 2

021

Page 2: Fusion of Federated Learning and Industrial Internet of ...

2

do not want to transmit their local data to the central server.Training the learning model centrally requires a central cloudwith immensely powerful computing capabilities and storagecapacities. Meanwhile, recent advances in computing hardwareand the proliferation of smart devices in our daily lives haveshown that each IoT device can be equipped with reasonablecomputing and storage levels, which is closely comparable toa desktop computer ten years ago [4]. Therefore, the standardML model is not readily applicable to large-scale IoT networksand cannot exploit the availability of distributed computing.This calls for a new learning model that leaves the trainingdata distributed across individual IoT devices instead of beingcentralized.

Motivated by this issue, Google invented the concept of fed-erated learning (FL) for on-device learning and data privacypreservation [5]. Using the FL approach, each IoT device cantrain its model based on locally collected data. Local data fromIoT devices does not need to be transmitted to the centralizedcloud. The centralized cloud only needs to collect the updatedlocal training model from individual users. Thanks to itscharacteristics, FL has been adopted in many applications, forexample, FL for improving Google keyboard suggestions [6],FL for healthcare [7], FL for smart city sensing [8], and FLfor medicine [9]. To pictorially illustrate the concept of FL, anoverview of FL in IoT systems is shown in Fig. 1. In general,each IoT device has its own dataset and the aggregation servercan be either located at the network edge or a virtual cloudin the remote cloud computing system [10]. Each FL modelhas its own pros and cons, depending on various factors. Forexample, FL with the server at the network edge is suitablefor applications requiring low latency, location awareness, andnetwork contextual information [4] while cloud-based FL issuitable for applications with massive IoT devices over mul-tiple regions and requirements of powerful computing/storagecapabilities.

Although FL has found many benefits in enabling privacy-preserving learning solutions for IoT systems, it also poses sig-nificant challenges such as resource management, robustness,security, and incentive mechanisms. These challenges wouldbe more significant and difficult to handle in industrial IoT(IIoT), in which much higher levels of security, safety, andreliability are demanded. To enable FL in IIoT applications, anumber of research works have been dedicated over the lastfew years. For example, Liu et al. [12] investigated a solutionfor IIoT data sharing applications using blockchain and FL andshowed that the proposed scheme can enhance security andperformance utility without requiring a central server for train-ing. Similar to [12], Zhang et al. [13] employed blockchainand FL to detect device failure in IIoT environments and tosecure raw data of IIoT devices. More recently, Liu et al. [14]developed a communication-efficient FL approach to detectdevice failures in IIoT through the use of FL and the long-shortterm memory (LSTM) model. The advantage of this work isthat data privacy is preserved via FL and at the same time time-series data in IIoT can be efficiently handled via the LSTMlearning model. In spite of the fast development of FL and therapid emergence of many IIoT applications, there is a lack ofa comprehensive survey that provides an overview of FL, IIoT,

and applications of FL to IIoT areas. Motivated by this fact, inthis work we will be focusing on providing the fundamentalsof FL, IIoT, and reviewing state-of-the-art research works onFL for IIoT applications.

A. State-of-the-Arts and Contributions

Although there have been a number of surveys focusing onFL and IIoT, these two topics are usually studied separately.With regards to IIoT, the most well-known survey is presentedin [15] with an introduction to IIoT, enabling technologies,major applications, and research outlooks. Similar surveysof IIoT on enabling technologies and applications can befound in [16], [17], in which the term ”Industry 4.0” is usedinstead. The roles of big data analytics in IIoT and a thoroughclassification, including data sources, big data tools, big datatechniques, requirements, applications, and analytics types, arepresented in [18]. More recently, a survey on IIoT security andthe application of edge/fog computing is presented in [19].

With regards to FL, several surveys have been conductedover the last few years such as [20]–[26]. The two surveysin [20], [21] present an introduction to FL along with generalapplications and challenges. Lim et al. [22] firstly providedfundamental features and unique characteristics of FL, andthen reviewed three major aspects of FL at mobile edgenetworks, which are communication costs, resource allocationoptimization, and privacy and security. In [23], [24], securityand privacy issues in FL systems are reviewed. In particular,Mothukuri et al. [23] tried to address four major questions,including the source of vulnerabilities, the kinds of securitythreats/attacks, comparison with security and privacy issuesin conventional and distributed ML methods, and the defen-sive techniques. A survey on enabling technologies, networkprotocols, and promising applications of FL can be foundin [25]. A short survey on personalized models in FL ispresented in [26], which concluded that personalized modelswould enhance the learning performance of individual usersin comparison with the global model and local models. Wecan also find interesting topics about FL in recent magazinearticles. For example, the integration of FL into 6G wirelesssystems is discussed in [27], FL for intelligent fog/edge cloudradio access networks is introduced in [28], a frameworkof reliable FL is proposed in [29], FL-based intelligent IoTover controllable communication channels via reconfigurableintelligent surfaces was studied in [30], and incentive resourceallocation mechanisms using game-theoretic approaches areinvestigated in [31]. A recent survey on the integration ofFL and IoT is provided [32]. This survey also provides aclassification of FL-IoT studies, which are cloud server design,miner classification, edge collaboration, federated optimizationschemes, incentive design, security and privacy, and globalupdate strategies.

Different from existing works, in this work, we set toprovide a survey on the use of FL for IIoT applications. Tothe best of our knowledge, this work is the very first attempton the integration of FL and IIoT. The primary contributionsoffered by our work can be summarized as follows.

Page 3: Fusion of Federated Learning and Industrial Internet of ...

3

...

Radio

tower Server

Local

updates

Participants

Model

aggregation

Global

model

Selected

participants

Model

aggregation

Cloud

Global

model

Fig. 1. An overview of FL in IoT systems, adapted from [11].

• We present the fundamentals of distributed learning, FL,IIoT architecture and characteristics as well as motiva-tions of integrating FL with IIoT.

• We review the state-of-the-arts on FL-IIoT, including FLfor enhancing security and privacy in IIoT, data man-agement and resource management in IIoT, and promis-ing application areas such as autonomous industry andhealthcare industry.

• We discuss a number of challenges associated withcurrent studies on FL-IIoT, and further highlight openresearch issues that should be efficiently addressed tofully realize FL-IIoT solutions.

B. Paper OrganizationThe remaining parts of this paper are organized as follows.

In Section II, we present fundamentals of IIoT and FL aswell as motivations of FL-IIoT integration. In Section III, wereview the use of FL to provide secure solutions in IIoT. Next,Section IV presents data management and resource allocationin IIoT and discusses the use of FL. Applications of FLfor important sectors in IIoT, including autonomous industry,smart healthcare industry, and IIoT security, are reviewed inSection V. Challenges and future directions in the use of FLfor IIoT are discussed in Section VI, and finally we concludethe paper in Section VII. For clarity, the list of frequently usedacronyms is summarized in Table I.

II. INDUSTRIAL IOT AND FEDERATED LEARNING

A. Fundamentals of Distributed Learning

The rapid increase in digitization through several advancesin information technology applications such as IoT, smartcities, industry 4.0, etc. has led to tremendous growth in thedigital data [33]. Traditional ML approaches where the MLalgorithms are trained using the data stored in a personalcomputer or at a centralized locations may not work on real-time applications in this era of big data due to the limitationsin scalability and time complexity. If the training data is toolarge, the ML algorithms may not be trained properly due tothe memory and computational constraints. To address thisissue, distributed ML (DML) has been introduced. In DML,the ML algorithms can be trained on the subsets of the datawhich are distributed across several nodes [34], [35].

Distribution of the training data in DML can be performedby following two approaches. The first approach is horizontalfragmentation, where the instances or rows in the dataset aredivided into several fragments and each fragment is storedin a node in the distributed network. The second approachis vertical fragmentation, where a subset of attributes aredistributed across several nodes in the distributed network.The ML algorithms can then be executed in parallel in eachof the node in the distributed network, which reduces thetraining time of the ML algorithms significantly, thus making

Page 4: Fusion of Federated Learning and Industrial Internet of ...

4

the ML algorithms scalable and addressing the limitation ofmemory requirements for training the ML algorithms. Someapplications generating large amount of data in real-timewhere DML is extensively used are healthcare applications,IoT, smart city based applications, communications, etc. [36]–[40]. DML also plays a huge role in big data analytics [41],[42].

Even though DML has enormous potential in several ap-plications, it has its own limitations. Some of the limitationsof the DML are privacy, security, platform dependency, lackof standard measures to evaluate the DML algorithms [43].The following subsection introduces the concept of FL, howit addresses the issue of privacy and security in conventionalML and DML.

B. Fundamentals of Federated Learning

FL is a specific category of DML where the ML models aretrained on the data located at the decentralized devices suchas mobile phones and other smart devices [20]. The devicesparticipating in the FL need not exchange their data samples,that ensures the privacy and security [21]. The underlyingprinciple in FL is that the ML models are trained on local data,then the parameters of the ML models are exchanged betweenthe local nodes (devices) at regular intervals that enables thecreation of a global ML model. The training data remainson the individual devices. The global ML model resides at aserver. Once all the devices send their models to the server,a combined model is created by averaging the parameters ofthe individual models. In this way, the individual devices learncollaboratively from a shared model. Some of the advantagesof FL over traditional ML models are that FL ensures dataprivacy, reduced latency, reduced power consumption [44].Not only the FL ensures privacy of sensitive information ofusers, but also FL can deliver personalized ML models to theusers with enhanced user experience [29].

The basic difference between FL and DML is that the DMLconcentrates mainly on parallelization of the computing powerwhereas FL concentrates on training the datasets that aregenerated locally on devices such as mobile phones. The DMLassumes that the datasets are distributed identically (in size)across several nodes in the architecture. Whereas, in FL, thesize of the datasets at different nodes may be heterogeneous[45]. Also, the nodes in the DML are typically data centers thatcan communicate with latest communication technologies andthey have access to powerful computational resources. On theother hand, the nodes in the FL are typically devices that maybe battery-dependant and also work on communication medialike Wi-Fi and cellular networks that makes them unreliableand drop out frequently from the network.

C. Characteristics of IIoT

IIoT is collection of people, sensors, machines, and comput-ers that enable intelligent industrial operations with the helpof ML and analytics [46]. IIoT enables intelligent devices andnetworked sensors to collect the data from the manufacturingplants and make use of predictions through AI and automate

the decision making [47], [48]. IIoT offers a revival of in-dustries that have been lately struggling due to several factorsincluding shortage of skills from personnel. IIoT offers severalbenefits because of their minimal or nil dependency on humanintervention. IIoT systems can make intelligent decisions byautomatically learning from the data generated through thesensors. IIoT leads to smart machines that can achieve high-est levels of accuracy that was not possible earlier becauseof human intervention. This disruptive technology has hugepotential to transform the way the products are manufacturedand delivered, make factories smart and efficient, protect frontline workers by providing better security in difficult conditions,reduce human errors, increase efficiency, thus, in turn, savehuge amount of money [49], [50].

Many sectors and applications such as manufacturing, trans-portation system, oil and gas, healthcare, agriculture, smartcities, energy are benefited by IIoT. One of the main benefitsof IIoT is its ability to increase efficiency in operationsthrough prediction. For instance, the sensors in a machine canautomatically pinpoint specifically where the trouble is whenit goes down and place a service request automatically. Thesensors can predict the likelihood of a breakdown of machinesbefore it happens by automatically sensing the sounds of amachine, temperature, vibrations, and other factors by usingdata analytics and ML. Hence, predictive maintenance reducesthe idle time of a machine and also reduces the damagescaused by faulty machines by fixing the issues before they areescalated [51]. Another important advantage of IIoT is locationtracking of equipment and tools. In many industries, theworkers spend significant amount of time searching for tools,finished goods in an inventory. IIoT sensors saves lot of timeby providing location services, thereby making it impossibleto lose equipment, goods [52]. Until recently, manufacturingcompanies used to spend huge capital on purchasing andmaintenance of equipment. But with the advent and advance-ments in IIoT, the industries can lease the equipment from thevendors. The owners can remotely monitor these equipmentthrough the sensors, thereby delivering upgrades, repairs, andmaintenance remotely. This will save both time and capitalfor industries [53]. To summarize, the main benefits of IIoTare 1) reduced waste, 2) reduced maintenance costs, 3) energysavings, 4) workforce productivity gains, 5) improved service,and 6) revolutionary products and services.

D. Motivations of Integrating Federated Learning with IIoTThe exponential growth in the IIoT applications is being

hindered by several issues such as security, privacy, commu-nication cost, etc. Integration of FL with IIoT has the potentialto solve many of issues discussed above. The main motivationsbehind integrating FL with IIoT applications are summarizedas follows:

• Security and Data Privacy Preservation: In order toget valuable insights and patterns from the data gener-ated by IoT devices in IIoT applications, ML and DLalgorithms are used. To understand the complex patternsfrom the data generated, these algorithms are trainedregularly by large datasets collected from different in-dustries/locations. But moving these datasets to a central

Page 5: Fusion of Federated Learning and Industrial Internet of ...

5

location for training the algorithms exposes the sensitivedata to potential hackers, and intruders [54]. The datasetsneed not be transferred to a central location for trainingthe algorithms when FL is used. Hence it secures thesensitive data from potential attackers. This also preservesthe privacy of sensitive data.

• Reduced Communication Cost: The sensors in IIoTapplications generate large volumes of data. This data isstored in the edge devices for analytics and then moved tothe centralized locations such as the cloud. Transferringhuge volumes of data incurs significant communicationcost. With FL, the entire data need not be transferredto the cloud. Instead, only the summarized results afterapplication of the the ML/DL models is transferred tothe cloud. This reduces the communication cost fortransferring the data to the cloud.

• Improved Performance of the Network: A ML/DLmodel is implemented on the data generated from IIoTapplications is stored in an edge device and only the sum-mary of the results is transferred to the central locationwhen FL is used. Hence, the traffic in the network isreduced considerably. With reduced network traffic, theoverall performance of the network can be significantlyimproved.

• Scalability: Integration of FL with IIoT enables theDL algorithms to scale their learning as they need notbe trained on large volumes of data generated locally.The central algorithm is trained only on summarizedresults from the individual edge devices. This facilitatesa significant scalability of the central learning algorithmsand train from the data generated at several edge/localdevices.

III. FEDERATED LEARNING FOR SECURITY IN IIOT

FL protects user data by training the model without trans-ferring the data from client to the main server. The scope ofsecurity mechanism can be further enhanced while sharing themodel updates and integrating the data generated from similardevices from various industries. In this section, we discussabout providing different security mechanisms by integratingFL with ML and DL models in IIoT framework. FL withblockchain in IIoT is also presented in this section.

A. Federated Machine Learning for Secure IIoT

The amount of data generated from a single industry will notbe suitable for applying ML techniques [55]. Generally, manyindustries would have similar IoT surveillance systems [15]and so the data generated can be federated from all the devicesthat could lead to increase in the accuracy of ML algorithms.However, the key concern is the data security [56] providedduring leveraging of data. Federated ML (FML) has gaineda lot of attention because the way it handles the privacy bydecentralizing the data generated at the end user device andaggregation of ML models at a centralized server. The twotypes of attacks in IIoT namely eavesdroppers and hackerswere discussed in [57].

1) Eavesdroppers: Attack happens when the data is trans-ferred through communication channels. In this case, itmay happen between IoT node and IoT sink, then fromIoT sink to centralized server communication channels.

2) Hackers: Attack that happens in the centralized server. Astalker resides in the server and is capable of obtainingactual locations of a user.

As discussed in [57], initially the data is generated fromIoT devices from every smart industry sent to the IoT sink.The sink is a repository for collecting the data from differentIoT nodes in the industry through wired and wireless com-munications and it also encrypts the data which is sent to thecentralized server. Then the server collects the informationfrom multiple IoT sinks and federates the data. Finally, thesmart industry decrypts the knowledge into an understandableformat. In this case, both eavesdroppers and hackers attacksare not possible, as encrypted data is present in both the cases.

The main intention of FL is to collectively learn the globalmodel without sacrificing the privacy of data. But sharingmodel updates to the server during training process maylead to an issue of leaking sensitive data like user personalinformation. In order to preserve privacy in ML models,number of methods are employed in FL.

• Differential Privacy (DP): DP defines the amount ofinformation about the data that can be available forthird party analysis [58]. Information under DP can becategorized as general information which has the infor-mation about the entire population and the other typeis private information that has the personal private data.Some of the properties of DP framework to provide sensi-tive personal information and privacy protection includesquantification of privacy loss, composition, group privacyand closure under post-processing.

• Homomorphic Encryption: Computation and analysis isdone based on encryption technique so that the attackerfinds very hard to find the original information [59].

• Secure Multiparty Computation: It is a model wheremultiple parties collaboratively compute without leakingany information to the third parties [39].

To summarize, [58] DP does not guarantee complete datasecurity because if the algorithm wrongly classifies a privateinformation as general information, then the algorithm fails toprotect the data. In spite of all these drawbacks, DP has provedto be more efficient privacy preserving approach compared tothe traditional centralized method.

B. Federated Deep Learning in IIoT

Recently, integration of DL models with IoT and edgedevices have become more popular which provide real timeanalytics with limited resources [60]. So, federated DL (FDL)enables Industry 4.0 companies to integrate DL in IoT de-vices and provides secure framework using FL as shown inFig. 2. DL is computationally expensive which requires costlyresources and framework. So decentralising DL models isa multi-dimensional issue which requires new technologiesframework to integrate DL with edge computing and IIoT. Themain goal of FDL is to provide IIoT with advanced capabilities

Page 6: Fusion of Federated Learning and Industrial Internet of ...

6

Optimize

Compression

Deploy

Serv

er Sid

e

Original DL Model Public

Training of DL Model Private

Edge Computing

Local Storage

Differentially Private SGD

Factory Machine SmartSensor

IoTDevice

Updates

Tra

ined

Mo

del

Industry Side

Clou

d

Fig. 2. Federated DL in IIoT.

using optimized DL which would transform the Industry 4.0factories into smart factories. Some of the parameters that arerequired to build the FDL model in IIoT are discussed below.

1) FDL Model: A FDL model can be implemented on bothclient and server side. In client side, private networks aredefined which have the DL model tuned and optimizedfrom the general model present in the cloud. Optimizedand tuned models are then deployed on the client side,where the model is trained with locally generated datafrom the end device. Finally the end device containshighly quantized and compressed FDL model. On theserver side, model present in cloud is continuouslyupdated by integrating differentially gradients from eachprivate network. Each local DL network takes its turn tocontinuously upload and download the currently updatedgradients to the cloud model. So, distributed selectivestochastic gradient descent approach is presented in [61]which can be applied in cloud model to frequentlyupdate local private model. The first decentralised modelnamed “Model Chain” uses blockchain technology [62],[63] to enable privacy preserving in the data transfer.In addition to that, asynchronous stochastic gradientdescent can also be used when a single model can betrained in parallel among all the devices, aggregated andprocessed.

2) FDL Communication and Networking: The main ad-vantage of using FDL is to run DL models in IoT devicesand to involve model in decision making process. Thiskind of decentralizing DL process enhance robustness,operational efficiency and reliability of IoT devices.FDL provides two types of communication, namely,intra-communication channel and inter-communicationchannel. Former transmits the data among all the tiersin the framework. FDL communicates between the IoTand cloud tier where the optimized model in the cloudis deployed to the end device. However, the security andprivacy must be maintained in the FDL during communi-cation. In inter communication channel, the componentsin each layer communicates with each other in threedifferent ways such as cloud, edge and end device. Themain aim of FDL is to minimize intra-communication

TABLE IIDIFFERENCE BETWEEN FDL AND FML.

Features FDL FMLCommunication Communication between

cloud and IoT is possibleCommunication betweencloud and IoT is not pos-sible

Privacy Privacy is less than FML Privacy is moreData storage Private DL model is

maintained in cloudNo data resides in cloud

Updates Gradient DL models up-dates are sent to thecloud

Statistical updates aresent to the cloud

Devicesconnectedwith Cloud

IoT devices are directlyconnected with cloud

Edge devices act as abuffer between IoT andCloud

Model access Model access is private Model access is publicData control Data control is central-

izedData control is decentral-ized

Optimization Optimization is required It is not required

and maximize inter-communication which would signif-icantly reduce the cost of communication.

3) FDL privacy and security: Lim et al. [22] discussedabout various security threats while FDL shares DLmodels from cloud to end devices. Anyway, FDL buildsDL models that do not expose information about thedata to the cloud. Security issue on the server sideincludes sharing of DL models on the cloud that leadsto confidentiality and security risk. Security issue onthe client side is done by encrypting the data duringthe training process before sending it to the cloudserver. Some mechanisms namely DP and homomorphicencryption technique controls the amount of data to beshared on the cloud.

4) FDL Optimization: As the end devices have limitedmemory and computational requirements, DL modelsmust be optimized so that they can be deployed onIoT or end devices efficiently. In terms hardware op-timization, GPU offers low power computation thatdecreases computation time. FPGA and Google’s TensorProcessing Unit [64] are other DL devices that booststhe processing of DL network. In terms of memoryoptimization, algorithms like shared memory allocationalgorithm can be used for DL models. Dynamic schedul-ing [65] is one of the key process used for optimizingthe performance on the cloud server.

To summarize, the difference between the above two sec-tions FML and FDL is presented in Table II. Recently,researchers in [66]–[69] proposed DL models in FL for IIoTnetworks in various application such as automobile and mobilenetwork, traffic network and in image processing applications.

C. Blockchain in IIoT using FL

Generally, FL uses a central server to aggregate all the up-dates of ML and DL models and communicates the aggregatedupdates to the global model. Security breach can happen whenthe updates are transferred to the server or when the globalmodel is sent to the clients. So to overcome this securityissue, the blockchain concept can be applied to store the

Page 7: Fusion of Federated Learning and Industrial Internet of ...

7

model updates in immutable form. Yin et al. in [70] presenteda secured model that uses blockchain as a decentralizedframework to replace the traditional central servers. Also, itstores the historical information as tamper resistant data. Lu etal. [12] investigated some security issues in node, block andframework which are designed using blockchain in FL.

• Node security: In terms of node security, blockchainensures the node permission control using alliance chainand data security part is implemented using the chainof nodes. Each node is created when the parameters areupdated or model is modified. Each validated updateis stored in the form of chain which is untampered orimmutable. Committee consensus mechanism can alsobe used to validate the node to verify whether they aremalicious or not. Committee allows the good node tosend the updates and change the global model for efficienttraining. If it is a malicious node updates, then the FLwill ignore the updates which would affect in designingof global model.

• Block security: It is a vital process in blockchain toimplement the consensus for each block. Executing andbroadcasting the consensus is highly time consumingtask, so it is inevitable to reduce the consensus cost.

• Model Security: The data is secured in FL as there is notransfer or exchange of information between the clientand the server. But the global model constructed in thecentralised server can be hacked by the unauthorized per-son. Model trained in the central server is more exposedto attack compared to model stored in local server. Sothe model build and trained should be preserved at boththe ends.

The workflow of FL with blockchain is shown in Fig. 3. IIoTdevices are designed in such a way that they train the modelwith local data. After training, they send the local updates tothe edge node where all the local updates are consolidated.Edge nodes in turn send aggregated local updates to the fognode where the updates from all the edge node are aggregated.Then, the updates are converted into a block, which is verifiedbased on the smart contract. The validated block is joinedwith onchain which is tamper-proof and secured. Finally, allthe local updates are received by the centralised server to trainthe global model. Then the updates of the global model is sentto all the edge node to update on all the end devices connectedto the edge node.

The framework must be scalable for real world applicationsthat can adapt to different kind of scenarios. In recent works[71]–[73], researchers have applied blockchain in image pro-cessing and vehicular networks using FL in IIoT framework.The works in [74]–[76] designed a FL based blockchainframework for health industry, railway network and defenceorganization. Generally, privacy preserving mechanism in FLusing IIoT will be applied on end to edge device communi-cation and from edge to collaborative model. Even in worstcase also only the model updates can be tampered betweenend devices to collaborative engine. Recent studies of FL inML, DL and blockchain in IIoT are summarized in Table III.

Ed

ge

Co

mp

uti

ng

Edge 1 Edge 2 Edge 3

Sensors Machine Tools Smart Instruments

Collaborative Model

Fog

Com

pu

tin

gB

lock

chai

n M

od

el

ON

Ch

ain

OF

F C

ha

in

Validate Vallidate

Server/Global ModelCloud

Block 1 Block 2 Block 3 Block N

Request Response

Fig. 3. Blockchain in IIoT using FL.

IV. DATA MANAGEMENT AND RESOURCE MANAGEMENT

Internet of everything (IoE) is a form of the network, in-cluding physical, digital and virtual objects [80]. These entiregroups of objects are networked together to serve the users’requirements. The requirements are served by communicatingwith each other and these are performed over the Internet. Thedata generated by virtual objects as well as physical objectsare heterogeneous and huge in volume such as logs in theirSCADA systems [81], data from various business-orientedapplications [82], radio-frequency data (RFID) [46], data fromwearable devices [83], data from sensors [84], real-time webdata [85], media data and location data [46]. So the availabletraditional data management techniques fail to handle suchheterogeneous and huge data. There are a variety of distinctsolutions, such as architecture-based solutions, middle-warebased solutions and indexing solutions. Though traditional MLhad ruled the various intelligent smart industrial applications,most of the IIoT applications are not widely deployed in real-time. The reason for the gap in real-time implementation isdue to sensitivity of the data and hesitation of the industryin sharing their sensitive information over the Internet andalso due to resource constraints in handling the communicationcost, throughput and latency. FL can be utilized to overcomethese challenges as the technique is specifically designed tomaintain data privacy by avoiding global sharing of data. Thissection investigates the contributions of researchers in the fieldof FL for data management and resource management.

Page 8: Fusion of Federated Learning and Industrial Internet of ...

8

TABLE IIIOVERVIEW OF RECENT STUDIES ON FL WITH SECURE IIOT.

Ref Framework Algorithm Application Techniques Contribution Inference Limitations challenges[57]

FMLin IIoT

FTM Smart factoriesthat belong tosame verticalindustry

FTM with homomorphic en-cryption which leverages thelinear transformation

◦ Every factory has to share its ciphertextdata◦ Mines the same knowledge from differentfactories◦ Defend the attacks from distributed andcentralized hackers

◦ Challenge to maintain balancebetween ML and data security◦ FTM needs the centralisedserver to operate the federatedmining

[58] Primalchain NLP and speechrecognitiono

DP ◦ Uses FML to generate global representa-tion of distributed knowledge in IIoT envi-ronment◦ Blends DP, FML and blockchain

◦ Large scale ML latency is more◦ Efficiency can be improved

[77] TensorRidgeRegression

Image processing Parallel factor decomposition ◦ Projections of input tensor to more thanone directions◦ Multilinear mapping from tensorial inputspace to continuous output space

◦ High dimensional datasetsleads to overtraining, high com-putational complexity and largememory requirement

[78] STA Computer visionand patternrecognition

Tensor Alignment Technique Unsupervised tensor feature extractionEnhances the robustness in the alignmentstep◦ Works good for high dimensional space

Sparse projection learning meth-ods are not investigates for tensorrecognitionNot efficient for high dimensionalspace

[79] CPSS Wireless IoT ◦ Privacy preserving high or-der singular value◦ Decomposition based crite-rion

◦ Model allows users to utilize the storageand computing resources of cloud and fogwithout leaking any sensitive information

As it uses encrypted data, cost isvery high

[66]

FDL in IIoT

DeepPAR

DeepDPASmart factories ◦ Proxy re-encryption

◦ Group dynamic key manage-ment

◦ Protects each participants input privacywhile preserving dynamic update◦ Guarantees secrecy in group participants

Still security mechanism at edgedevice can be improved

[67] DoubleDeep Q-network

Mobile networks Decentralised cooperate edgecachingMarkov decision process

◦ Federated reinforcement learning deepframework enables fast training and decou-ple the learning process from cloud◦ Reduces performance loss and averagedelay

◦ Decentralaized loss can bemaximized

[68] FedGRU Traffic network ◦ parameter aggregation◦ Federated averaging algo-rithm◦ Joint announcement protocol

◦ Random subsampling for participating or-ganization is used to reduce the communi-cation overhead◦ Captures the spatiotemporal correlation oftraffic flow data

Spatial temporal dependency canbe better captured by applyinggraph convolutional network

[69] ASTW-FedAVG

Image Processing ◦ Asynchronous strategy◦ Weighted aggregation

◦ To reduce the communication between theclient and server◦ Asynchronous strategy is used to aggre-gate and update the parameters◦ Previously trained local models are aggre-gated to enhance the performance

◦ Only server are involved inevolving the local model◦ Communication cost can be re-duced

[70] Federatedcollab-orationframework

Wearable sensordata

Private and public data centerblockchain

◦ Framework for collaborating multipartycomputation is designed◦ Blockchain is used to overcome securityissues

Large scale multiparty secure col-laboration of IoT data is chal-lenging

[71]

FederatedBlockchain

BlockFL WirelessCommunicationNetwork

◦ On device ML◦ Smart Consensus mechanism

◦ Verified local updates are generated asblock and added to blockchain◦ Each device generate global model fromthe updated new block

Proof-of-work can be hacked bythe malicious miners

[72] FL-Block Image processing ◦ Decentralized privacy◦ Poisoning attack

◦ Allow local updates to exchange withglobal blockchain model◦ As the central authority is replaced byblockchain, privacy is achieved more

Tradeoff between privacy protec-tion and efficiency should be opt-mized

[73] BFL Vehicularnetwork

◦ Renewal reward approach◦ VML

◦ Provide efficient communication of au-tonomous vehicles◦ It forms representative groups from theoverall vehicles

Choice of block updates willlimit the increase in overall delay

[74] CrowdSFL Health Industry Crowd sourcing encryption al-gorithm

◦ Data privacy is controlled by smart con-tracts◦ The updates submitted by the local devicecannot be leaked as it enables the decen-tralised blockchain platform

◦ Reward distribution mechanismis not defined◦ Model can be optimized

[75] SVMand Deepnetwork

Railway network SVM based on mixed kernelfunction and multi class model

◦ Distributed asynchronous collaborativeML is designed without involving the cen-tral server◦ Smart contract involves all participant andgenerates new block

Accuracy and efficiency of theprediction can still be maximized

[76] IoBT Social Securitythrough defenceorganization

◦ Multilayered model◦ Defence led combat

◦ Distributed computing defence frameworkfor sustainable society◦ Optimize the data trained in local nodes

Participation of local node intraining process can be rewardedbased on their contribution

A. Data Storage in IIoT

The world has now stepped into the era of IoE. IoT is asubset of IoE and IIoT is a subset of IoT. As we move furtherinto this era, the data produced and gathered goes beyondtetra-bytes of data. This data is structured, semi-structured

and, in some cases, unstructured. This huge volume of dataproduced by the IIoT environment is of great value and canbe used to extract knowledge about the manufacturing unitor maintenance unit or any unit in an industrial environmentso that any kind of abnormality can be prevented in the

Page 9: Fusion of Federated Learning and Industrial Internet of ...

9

near future. But storing such huge data and processing themto extract knowledge and reacting at the right time is thebiggest challenge now. The technologies, namely fog com-puting, cloud computing, and edge computing have given theresearchers an opportunity to solve the issues with respect todata storage [86].

In [87], Fu et al. proposed a framework for processing theIIoT data. Almost all the data life cycle phases are integrated,starting from pre-processing to the retrieval of the requesteddata stored or archived using fog and cloud computing. Theframework was designed to have entities, namely edge server,proxy server, and cloud server for taking care of data storageand processing. The time-sensitive data generated by the IIoTdata sources is received by the edge servers where the data isprocessed at a fundamental level. If the data in the edge serveris needed for future usage, then those are transmitted to theproxy server where the quality of the data is improvised bypre-processing, transforming into structured data so that theycan be stored in the cloud server as historical data for furtherpredictive analysis.

Borylo et al. [88] proposed and presented a scheme nameddynamic resource provision by combining the cloud and edgecomputing schemes for the purpose of energy-awareness. Inthis scheme, the major limitation was that the amount of en-ergy consumed was high. In order to overcome this limitation,Kaur et al. [89] designed an architecture consisting of threedifferent layers, namely cloud layer, edge layer and software-defined network (SDN) layer. The SDN layer plays a majorrole in handling network congestion and hence minimizesenergy consumption.

In [90], Singh et al. proposed a data storage method usingbloom filter (BF) and fuzzy logic, namely Fuzzy FoldedBF (FFBF). Two different BFs are compressed into a singleBF by using fuzzy operations. The major advantage of thiscompression was that in a single BF, a large volume of dataelements was stored. The rate at which the data decayed wasreduced so that the data streamed were residing for a longertime. The computational cost was reduced due to the usage ofthe double hashing technique.

An integrated framework, namely IoT based Industrial DataManagement System (IDMS), for an IIoT environment wasproposed by Saqlain et al. [91] for handling and managingthe huge industrial real-time data acquired from a smartmanufacturing unit. This framework tried to monitor thishuge real-time data online and hence make wise decisions.For the purpose of handling emergency events, traditionalcommunication protocols were used. The middle-ware, whichwas designed in the IDMS, provided a Service OrientedArchitecture (SOA). The raw data was transformed into astructured format by the framework so that it can be archivedin the cloud server where useful information and knowledgecan be extracted.

In [92], Anton et al. presented a model that can smartlycollect, aggregate, and analyze the data collected from het-erogeneous sources. The model consists of three levels. Thebase level, that is the first level, is responsible for collectingsingular packets which are correlated along with various logsand the collected packets are analyzed here. The second level

is responsible for aggregating all the packets that flow inthe network and the information is extracted based on theapplication. This level is more application-specific. The thirdlevel which is the highest level in the aggregation model,aggregates all the raw data and meta-information related toeach application into a connected graph which gives an overallpicture of the flow of packets in the network.

An IIoT based intelligent control and management systemwas designed by Du et al. in [93] for the purpose of themotorcycle endurance test. The architecture designed in thissystem is made of four layers, namely the executive layer forthe sake of data acquisition, the cognitive layer which takescare of the business logic, the network layer which acts asthe communication layer and finally, the control layer. Thesystem provides an interface for establishing a relationshipbetween the protocols and the other external parameters. Thedesign enables easy management, up-gradation, extension andcompatibility.

Wan et al. [94] analyzed the general architecture of IIoTthat includes the physical layer comprising of the devices,industrial wireless networks (IWNs) dealing with protocols,industrial cloud for storage and smart terminals for betterinteraction. As a result of this analysis, this work proposeda software-defined based IIoT architecture consisting of threelayers named physical infrastructure dealing with the imple-mentation as well as production, control layer taking careof the interaction between physical and application layer andfinally, the application layer for innovative applications. Theyalso assessed the performance of this architecture using a test-bed. The results showed an improvement in the utilization ofthe equipment and also reduced the consumption of energy.

Shu et al. [95] proposed a novel architecture termed as cloudintegrated cyber physical system architecture (CCPSA) whichcomprises of three domains: one domain which focuses onthe network and their communications, other-oriented towardsthe cloud platform dealing with the control and computationalaspects and the final domain is a data-centric CPS. Thiswork also focused on providing solutions for challenges likeresource management, scheduling aspects of resources. Liu etal. [96] proposed a methodology for acquiring and collectingdata in an efficient manner so that no data is missed dueto the limited energy in the smart terminals using Ethereumblockchain and DRL. The DRL increases the amount of datacollected at a time by the mobile terminals (MTs) and theblockchain enables higher security.

To summarize, most of the available data acquisition anddata storage techniques as well as frameworks focus onspecific issues, but acquiring and storing such huge hetero-geneous industrial data is a challenging task. Also the datacollected from various factories distributed at different floorsand locations can be of different formats and volume [97].Now, most of the industries are shifting towards digitizationfor achieving their goals. Few challenges that can be listedout are data dynamicity, data visualization, storing industrialdata. To meet these challenges a well-versed data managementframework has to be devised that can efficiently handle thehuge data.

Page 10: Fusion of Federated Learning and Industrial Internet of ...

10

B. Federated Learning for Data Management

The IIoT era has enabled the heavy flow of data packets overthe Internet, which has increased the traffic. Moreover, recentadvancements have made it easier for the industrial giants togather huge volumes of data from devices located in differentgeographical locations. Though huge data is collected, most ofthe industrial applications lose their value in the market as theusers are not satisfied. The applications need to process thehuge data for making any decision by transferring them to acentralized server or location, which requires high bandwidth,but they fail because of a huge delay in the transportationof data from heterogeneous devices. To overcome this, fewheterogeneous architectural based solutions were providedlike usage of edge computing, fog computing, etc. But thesesolutions still have challenges like security, privacy, etc. Themajor issue with these kinds of architecture-based solutionsis that most of the decisions are made locally, neglectingthe global data, model and knowledge. This issue could besolved by the usage of a technology proposed by Googlenamed FL [25]. This section discusses the various approachescoined by various researchers for handling the influence oflocal knowledge over the decisions.

Wang et al. [98] proposed a framework named ”In-EdgeAI” which imparted intelligence to the mobile edge networkusing DRL and FL. This framework is designed to reduce thevolume of data that needs to be transferred to the central serverand to provide a better data management scheme to adapt theheterogeneity. Another set of researchers integrated FL withimitation learning (IL) and proposed a framework for improv-ing the performance and the learning capability of a robot [99].The framework was designed for cloud based robotic systemswhich gathered sensor data from heterogeneous resources. Yinet al. [70] proposed a framework based on FL for providing asecured collaboration of data. The framework was named asFDC and enabled data collaboration among multiple partiesdistributed across different geographical locations.

Lu et al. [12] designed a methodology by integrating theblockchain and FL for enhancing the computational efficiencyand to secure the sharing of data among the distributed clientsas well as the server. This work provided an IIoT test bed foranalyzing the performance of their framework. FL is used forsolving the data sharing problem in typical IIoT environment.The presence of a centralized server gave rise to so many dataleakage problems. Most of the clients hesitated to share theirprivate raw data over the network to a centralized location.Hence, a methodology using FL was designed in which themodel or the structure of the data was shared to all clients inthe IIoT Network instead of leaking or revealing the raw data.

Zhu et al. [100] proposed a methodology to optimize theirneural network model with the usage of evolutionary algo-rithms. This work used multi-objective functions for reducingthe computational cost. FL is used for the purpose of solvingthe scalability issues in the DL models and also to increasethe learning efficiency and performance of the global modelusing the data distributed across the clients.

A methodology called two-phase multi-party computation(MPC) was proposed by Renuga et al. which is an aggregation

model using FL concept [101]. The model provided privacyto huge sensitive data gathered from various companies. Thisfollowed a two phase methodology where initially a smallcommittee was elected and later MPC based aggregationservice was provided via the elected committee. The modelwas integrated along with a smart manufacturing IoT envi-ronment and the results showed high accuracy with reducedcommunication cost and computation time.

FL is now integrated with DL models to provide intelligenceto industrial applications. A scheme by name Efficient andPrivacy Enhanced FL (PEFL) is proposed in [102]. Thisscheme is best suited for industrial sectors dealing with sen-sitive data like healthcare, auto-pilot, auto-driving, industrialrobots, etc. where decision making cannot be compromiseddue to hesitation in sharing data. The major advantage of thisscheme is that the performance is not reduced even if there isa collision due to multiple entries.

A cross domain sharing based scheduling scheme [103] wasproposed for providing edge services with higher intelligence.The scheme integrated the block-chain and edge intelligencemechanisms to provide an intelligent IIoT framework. Theresults showed an improvement in service cost of the edgedevices and their efficiency in terms of service. The proposedscheme is suitable for IIoT networks beyond 5G. This workalso designed a transaction approval scheme for reaching theedge resources which are distributed all over.

An optimization technique using particle swarm optimiza-tion (PSO) and FL is proposed in [104] focusing on thehyper-parameter tuning for the DL models that are availablelocally in the smart city applications. PSO is used for settingthe hyper-parameters in case of the DL model LSTM. Theresults showed a reduction in number of rounds required foridentifying the parameters by 2% to 4%.

To summarize, there are various methods and buildingblocks that can be used for customizing the traditional MLand FL algorithms to suit the IIoT applications. Thoughthere are too many architectures available for FL, there is nospecific architecture that seems to be centered around specificindustrial sector. The algorithms and methodologies discussedin the section can work for any type of industry once modifieda bit keeping in mind various aspects such as efficiency, deviceset up, end users, end servers, structure, autonomy. etc.

C. Federated Learning for Resource Management

The recent industrial revolution uses a wide variety oftechnologies like collaborative robotics, cloud and edge com-puting, cognitive computing, CPS, ML, etc. [105] Thesecollaborative technologies have led to high level of industrialinnovations and applications. Integration of Industrial appli-cations with IoT and ML leads to various advantages likeoptimized production costs, increased productivity, reductionof error, improvising the automation process, higher quality,and so on. Hence ML plays a major role in any smart industrialapplication. But, the traditional ML aspects could not be usedin real-time Industry 4.0 applications as data cannot be sharedfor training in a centralized ML server [22] as there is a riskof data leakage [106]. Hence most of the IIoT uses FL which

Page 11: Fusion of Federated Learning and Industrial Internet of ...

11

is an efficient resource management technique for learning,adding new data and updating the aggregation server withthe model updates. This section discusses about the variouscontribution of researchers in the field of FL for optimizingand managing resources over real-time processing in Industry4.0 applications.

In [107], Chen et al. analysed the performance of FL withthe help of packet error rate over wireless networks. A closed-form expression was derived for calculating the convergencerate of FL. This was utilized for computing the optimal trans-mission power for allocating a resource block and selectinga user. This work also minimized the loss function for theabove said resource allocation and user selection. This workwas analyzed using a single base station (BS) scenario, andthus the blooming researchers can concentrate on analysingthe performance of FL for multiple BS scenarios. In [108],Abad et al. proposed hierarchical FL for optimizing the usageand allocation of resource in wireless applications. This workconstructed a heterogeneous cellular network which consisteda macro base station (MBS) and few small cell base stations(SBS). In the SBS, sub-global model is trained using thetraditional FL concept for specific iterations and this processis carried out in all SBS. The trained sub-global models in allSBS are then transferred to the MBS for aggregation of theglobal model. The global model updates are finally transferredto the SBS from where the updated parameters of the globalmodel are broadcasted to the end devices. This technique iswell suited for reusing the wireless resources. Though thisseems an attractive technique, the scheme follows a centralizedMBS which might crash due to physical damage or attackand hence there is a chance of losing the updated globalparameters.

Huang et al. [109] proposed a novel approach named Deep-Q for the purpose of enabling resource allocation and taskoffloading for networks in edge computing. This work focusedon offloading multiple tasks at the same time to the edgeservers. The proposed technique helped in minimizing the costdue to delay, computation and energy consumption. In [110],Khan et al. proposed a novel dispersed FL to be employed insmart industries. This work focused on optimizing the resourcein cognitive IoT based smart industrial applications. This workproposed a model which used distributed and hierarchical FLfor offering optimized resource communication and robust-ness. This work also formulated an integer linear programmingproblem for minimizing the overall cost of FL. Numericalresults were provided to compare the proposed approachwith two techniques namely random resource allocation andrandom association.

Hiessl et al. [111] tried to adapt the available traditionalFL for industrial environments and named it as IndustrialFL (IFL). This work introduced the various requirements anddesigns required for IFL. IFL trains and evaluates the MLmodels. This work also discussed the needs for IFL andalso provided a collection of structured workflows for therequirements in the architecture of IFL. To get adapted tothe diverse operating conditions of the industrial environment,the FL clients are not allowed to exchange the ML modelparameters with the FL participants in the IFL architecture.

A discussion on future such as usage of FL open sourceframeworks, asynchronous and decentralized FL, and negativeknowledge transfer were also provided.

In [112], Lu et al. introduced the usage of blockchainbased FL in the digital twin wireless networks (DTWN). Themethodology helps in improving the reliability, security anddata privacy by which the learning accuracy is balanced usingthe bandwidth allocation process. The designed DTWN modeluses the digital twins concept for mitigating the unreliablecommunication among the edge servers and users. The limitedresources available in the network are allocated using theoptimization problem that is formulated between the BS andthe digital twins. The major advantage of the scheme is goodlearning convergence and latency.

Messaoud et al. [113] focused on developing a FL basedmethodology to support the resource allocation strategy inmulti-industrial IoT. This work proposed a novel deep fed-erated reinforcement learning (DFRL) scheme for providingdynamic resource allocation and network management for theIIoT networks. The scheme provided IIoT slice based resourceallocation with the help of a novel proposal Deep FederatedQ-Learning (DFQL). This work also proposed two other ap-proached namely Multi-Agent deep Q-learning (MAQL) thatdynamically slices the transmission power and spreading factorfor maximizing the QoS requirements namely throughput anddelay. The DFL is also utilized for learning the multi agentmodel to enable decision making on the IIoT virtual networkslices.

Edge intelligence plays a significant role in IIoT since itprovides smart services via cloud with low latency rate andlesser cost. Since IIoT is a heterogeneous network with diverseresources, there is a need to integrate all these together toimprove the computing capability. Zhang et al. [103] proposeda scheme for cross-domain sharing of resources by usingan edge scheduling scheme based on DFL. This scheme isdesigned based on a credit-differentiated edge transaction ap-proval mechanism which enhances the flexibility and securityin the service provided by the edge with a minimal cost.

Unmanned aerial vehicles (UAVs) play a promising rolein IIoT applications, but the characteristic of UAVs is thatthey have a limited payload and lesser flight time [115].These UAVs have to be empowered with complex as wellas high level of tasks so that they can be utilized in smartIIoT applications. In [114], Zhang et al. investigated theUAVs to perform image classification tasks and proposed FLbased multi-UAV systems in the exploration domain. Theimage classification task is completed using the distributedFL. Multiple UAVs in the application area are coordinatedwith the use of ground fusion centre (GFC). Results showthat the accuracy of image classification by UAV is high andcommunication cost is reduced relatively. The global model isupdated at the GFC using the weighted Zero Forcing transmitprecoding at every UAV.

To summarize, resource optimization deals with optimizingthe communication as well as computation resources requiredfor enabling the FL systems to locally compute, communicateand update the global parameters. When these are optimized,the learning cost of FL model is significantly reduced. Apart

Page 12: Fusion of Federated Learning and Industrial Internet of ...

12

TABLE IVSUMMARY OF RECENT RESEARCH CONTRIBUTIONS ON FL IN IIOT FOR DATA STORAGE-DATA MANAGEMENT-RESOURCE MANAGEMENT.

Ref. Methodology Research Focus Techniques Used Contributions Proof of Concept[87]–[89]

Data Storage

◦ Secure data storage◦ Dynamic data gathering◦ Reducing congestion◦ Reducing energy consump-tion

◦ Middleware based solution◦ SDN◦ Interplay of fog and cloudcomputing◦ latencyAware policy

◦ Retrieval feature tree is designed◦ Reduction in storage space◦ Carbon footprint of data centre is reduced◦ Multi-objective evolutionary algorithm isproposed

◦ Gas density in factory◦ Customer services and main-tenance

[90], [91] ◦ Effective data storage◦ Acquiring huge data◦ Provide intelligent manage-ment

◦ SOA based solution◦ Distributed database server◦ Data aggregation model

◦ FFBF service architecture is proposed◦ An IDMS framework is proposed

◦ Manufacturing sector◦ Healthcare◦ Wearable devices

[92] ◦ Collection of data from var-ious sources◦ Aggregation of data◦ Analysis of data

◦ Event Centric Model ◦ A data aggregation model is proposed◦ Correlation model is proposed

◦ Maintenance Sector◦ Interconnected ProductionLines

[94]–[96] ◦ To handle device failure◦ To provide interface for ex-change of information and data

◦ Architecture based solution◦ SDN◦ CPS◦ DRL

◦ Blockchain based DRL is proposed◦ CCPSA is proposed◦ Equipment utilization rate is increased

◦ Complex industrial applica-tions◦ Automated guided vehicles◦ Industrial robots

[98]

DataManagement

◦ To reduce the amount of datauploaded◦ To provide intelligence tomobile edge nodes

◦ DRL for optimization◦ FL for providing intelligence

◦ An “In-Edge AI” framework for improv-ing deployment◦ Deep Q-Learning for improving compu-tation in edges

Xender’s traces

[99] ◦ To provide knowledge fusionmechanism◦ To provide a model for trans-fer learning

FL along with imitation learn-ing

◦ Federated IL (FIL) is proposed◦ A transfer learning approach using FIL isproposed

Self-driving cars

[70] ◦ To provide higher securitywhile multiple parties try tocollaborate

DL and FL along withblockchain

◦ A data collaboration framework is pro-posed using FDL◦ A blockchain based secure mechanism forcomputation is proposed

◦ A private blockchain basedon Libra protocol◦ Raw data collected fromwearable device

[12] ◦ To share data across multipleparties without any leakage

FL and blockchain ◦ Differential private FL is proposed ◦ Reuters data set◦ 20 newsgroups data

[100] ◦ To optimize the structure ofthe model◦ To minimize the computa-tional cost◦ To minimize the globalmodel test errors

FL and multi-objective evolu-tionary algorithm

◦ Federated averaging algorithm is proposed◦ A modified version of SET algorithm isproposed

Independent IID and non-IIDdata set

[101] To provide better privacy inshared models

FL and MPC ◦ Two-phase MPC enabled FL is proposed◦ Committee election algorithm is proposed

◦ Manufacturing unit◦ Fault detection in electricalmachines

[102] To prevent exploitation of pri-vate data in sensitive applica-tions

FL along with AugmentedLearning

◦ Privacy enhanced FL is proposed forIndustrial AI◦ Aggregation decryption is utilized◦ Model aggregation algorithm is proposed

◦ MNIST data set◦ Healthcare and autopilot

[103] ◦ To provide intelligence to theedge network◦ To prevent malicious attacks

Ubiquitous communication,blockchain along with DRL

◦ Credit Differentiated transaction approvalscheme is proposed◦ DRL for optimized resource scheduling

Five BS each having comput-ing server and caching server

[104] To optimize hyper-parametersin ML model

PSO along with FL ◦ PSO based parameter setting frameworkis proposed◦ The number of hidden layers and neuronsare optimized

Smart city traffic with CityPulse EU FP7 data set

[107], [108]

ResourceManagement

◦ To optimize transmissionpower◦ To reuse wireless resources

Hierarchical FL ◦ Closed form expression for computingconvergence rate◦ Optimal transmission power◦ Minimized loss function

◦ Single BS scenario◦ Heterogeneous cellular net-work consisting of MBS andSBS

[109] To enable task offloading inedges

DL based FL ◦ Minimize cost due to delay, computationand energy consumption

Heterogeneous networks inedge computing

[110] To optimize resource in cogni-tive IoT

Dispersed FL Integer linear programming to minimizeoverall cost

Smart industry

[111] ◦ To evaluate ML model◦ To provide structured work-flows

IFL ◦ Model for adapting to diverse operatingconditions◦ Proposed IFL

◦ Synchronous framework◦ Centralized FL

[112] To mitigate unreliable commu-nication

Blockchain based FL Learning accuracy is balanced DTWN

[113] To provide dynamic resourceallocation

DFRL MAQL for dynamic slicing is proposed Multi-industrial IoT

[114] To perform image classifica-tion in exploration domain

Distributed FL ◦ GFC is deployed◦ Communication cost is reduced◦ Weighted zero forcing transmit precoding

UAVs

from these the energy consumed by the end devices andlocal learning rate are dependent on the capability of FLsystems’ CPU. Various research contributions discussed in thesection gives solutions for reducing the computation cost andimprovises the resource allocation process.

It is expected that IIoT would transform the way humanbeings live, work or play. The future of IIoT varies from basicautomation to factory automation, high level of connectivity to

wearable devices and smart applications. IIoT is ready to setour lives with large number of sensible things around us whichwould be networked with each other to avoid any accidents orabnormalities. In the near future, there will be a variety of ap-plications which would be a part of our daily activities. Therewill be huge volume of data acquired by networks, upload anddownload of data via communication channel would be high.To handle these data, various solutions are proposed by recent

Page 13: Fusion of Federated Learning and Industrial Internet of ...

13

Federated Learning

Data Storage SOA

Architecture Based

Enhanced Equipment Utilization

Dynamic Resource Provisioning

Acquiring Huge Data

Prevents Dos and DDoS Attacks

Intelligent Management

Failure rate reduced by 50%

Minimize Computational Time

Middleware Based

Congestion Reduction

Enhance Remote Data Control

Dynamic Data Gathering

Efficient Data Retrieval

Data Management Deep

Learning Based

Reduce uploaded data

Dynamic Resource Allocation

Improving Computation in

Edges

Blockchain Based

Improves Reliability

Prevent Malicious Attacks

Better Privacy in Shared Memory

Prevention of Data Leakages

Evolutionary Algorithm

Optimize Hyper-Parameter Optimize

Structural Model

Resource Management

Hierarchical FL

Minimized Loss Function

Cross Domain Sharing

Minimize Convergence Rate

Optimal Transmission Power

Fig. 4. Illustrative Usage of FL in IIoT for data/resource management.

researchers with respect to data storage, data management andresource management and is summarised in the Table IV. Theusage of FL for data/resource management and data storagein IIoT is illustrated in Fig. 4.

V. APPLICATIONS

IoT has been playing significant role in our day to day mod-ern life with applications almost in every vertical such as smartautomobile industry, smart healthcare, defence, smart cities,smart grid etc [116]. It provides high end smart services andproducts which can transform traditional technology into anfully automated technology. Typically, millions of smart sen-sors are producing bundles of data that needs to be managedspecifically privacy as per new GDPR rules. As a consequence,task-specific ML models need to be developed that can providesmart services and products with no compromise on privacy.

A. Federated Learning enabled IoT in Automotive Industry

FL is a ML approach that trains an algorithm in a decen-tralized manner such that models are updated with the help oflocal data without sharing this data over cloud servers keepingthe privacy of the data intact [117]. FL along with IoT hasbecome the game-changer for almost the last two decades inevery modern field such as smart automobile industry, smarthealthcare, defence, smart cities, smart grid etc [116]. Thesecollaborative techniques along with IoT have transformedthe way every individual interacts with his surroundings byproviding smart services which convert traditional technologyinto a fully automated technology [118]. Considering auto-mobile industry, the consumer end has been benefiting the

most as modern vehicular systems have been transformed intoan automated device which enables all the smart vehiclesto be interconnected and also occupants in these vehicles tobecome connected to each other. Vehicles that are connectedto IoT provides constant and simultaneous access to informa-tion to chauffeur and occupants while moving. However, thenumber of vehicles connected to each other over IoT keepson increasing [119] which thrives for smart car emergingtechnologies with new requirements such as secure, robustand roadside infrastructures etc, which transforms the originalconcept of vehicular ad-hoc networks (VANETs) [120], [121],[122] into a new concept called the FL enabled Internet ofVehicles (IoV) [123]. Manufacturing of such smart vehiclesin automotive industry must be capable of employing IIoTalong with ML techniques paving the way for smart industryexperience enabling efficient and sustainable production [124].Computers, people, and machines coming together to executeindustrial operations by capitalizing advanced data drivenalgorithms for groundbreaking business results is termed asIIoT. IIoT provides a way to better understand and gain insightinto operations and assets of the company with the help ofdata associated with the various machine sensors, middleware,software, and back-end cloud compute and memory systems.Therefore, it provides a way to transform business operationsand processes by using advanced analytics to query large datasets by using the results obtained as feedback. These gainsresults in optimized efficiency of productions, and therebyincreasing profits. This connectivity allows for data collection,exchange, and analysis, potentially facilitating improvementsin productivity and efficiency. IIoT applications usually requirerelatively small throughput per node, and capacity is not a

Page 14: Fusion of Federated Learning and Industrial Internet of ...

14

major issue. Instead, a large number of devices need to beconnected to the Internet at low cost, which have limitedhardware capabilities and have small batteries thus requiresan energy efficient, cost effective, reliability, and privacysolutions. The IIoT will eventually have a greater impacton economy as it will innovate infrastructures to such largerextents that it will revolutionize the traditional industry. Witha strong focus on M2M communication, big data, and ML, theIIoT enables industries and enterprises to have better efficiencyand reliability in their operations. As big and high volumes ofdata is involved in IIoT devices, it is impossible to collect dataover cloud centralized server [10]. As high number of devicesand machines are connected, classical cloud-based learningapproaches are impractical and inadequate. But with FMLenabled IIoT, privacy of the data is taken into considerationwhich further enhances the performance of IIoT devices andalso accelerates the learning processes of the data whichtechnique you are talking or building the base.

There are a number of techniques available to access andtrain data. One such approach is to access user data whichis scattered and collecting it from cloud for modeling andthen transfer this trained model to client devices for providingautomation of specific tasks. But the cost associated with thistype of approach is huge and also latency will be increasedwhich is not desirable [125]. It also involves some data privacyissues as sensitive data is shared on cloud which can beleaked. The IIoT devices are also becoming the main targets ofmalicious attacks leading to disastrous consequences. These at-tacks misleads the global models to predict undesirable outputs[126]. Various counter methods have been proposed such asensemble diversity [127], Purifying Variational AutoEncoder(PuVAE) [128] and other training methods such as adversarialtraining, but none of them proved to be satisfactory as theyfocus only on a particular type of attacks. Alternative to thesemethods is to train the IIoT data locally in isolation to otherdevices but with this approach, resources required will be veryhigh. On the other hand, with limited computation ability andlocal memory and insufficient data samples can result in fur-ther degrading the model. So to solve the issue of availabilityof IIoT device data in isolation and maintaining privacy ofthis data, a more advanced ML approach is developed whichis termed as Federated ML [20]. The concept of FL was firstintroduced by google in 2016 [129] which aimed to accessuser data distributed across various IIoT devices and build MLmodels exploiting this data without its leakage [130]. Privacyis one of the basic attributes of FL which requires securitymodels and analysis to provide meaningful privacy guarantees[131]. FL enabled smart automobile industry has variousautomation systems, IIoT devices for storing correspondingdata, control systems and various operating machines whichare connected to each other through the internet. FML modelsare deployed to edge devices which updates the model basedon demands and also check for any anomalies in production[132].

An FL based IIoT model is depicted in Fig. 5 and stepsfollowed are mentioned below.

1) Selection of client nodes: In a smart industry, variousclients are automation systems and operating machines

having its own processor and is connected to othermachines having M2M communication, the IIoT subsys-tems collecting data from data generating machines forlearning tasks and training of models [133]. These clientedge nodes are selected by central cloud server that isinvolved in model training. This big data associated withother smart industries of some other company can alsobe considered while training the global model. Based onclient selection, an initial central model is developed byevaluating local raw data of edge clients.

2) Model distribution: This initial model is then distributedto clients for the purpose of updating and training thismodel based on clients local raw data.

3) Uploading locally trained models: These initial modelsare no longer initial as they are trained by the clientsusing their local raw data. The updated model is thenuploaded to centralized server for federated averaging.

4) Aggregation: Having all the updated models by localclients, central task executioner averages all these mod-els to develop a new version of Global model whichis better in terms of efficiency and accuracy than theinitial global model. For averaging FedAvg [133] is thebest algorithm to solve this averaging problem yieldinga well trained Global model.

5) Model Testing and update: This aggregated global modelis then tested by central task executioner with the datathat belongs to other clients that have not participated inthis training. Analyzing the testing results, some param-eters are tuned to repeat the training process if needed,else the model is updated accordingly and distributed toall the clients.

The above process is repeated until a global model isachieved which solves the purpose with considerable accuracy[134]. Based on this, Bonawitz et al. [130] recommend that alllocal updates are aggregated securely and globally by usingtheir weighted average. In this way, efficiency in productioncan be achieved with low cost.

In smart automobile factory, automated machines and con-trol systems drives the raw material used for production bycommunicating with each other cooperatively. Smart manufac-turing enables factory managers to automatically collect andanalyze data to make more informed decisions and optimizeproduction in reduced costs. All the decisions depends uponharnessing of data and based on FL models, data will decidewhat to do and when to do a particular task.

1) Requirements: Following are the requirements for FLenabled IIoT in automotive industry for continuous monitoringand adaptation of system.

• Data management: First and foremost requirement isthe management and harnessing of big data. To supportcollaboration of all the FL clients (FLCs), it is requiredto publish metadata describing the organization and itscorresponding edge IIoT device. Based on this data, FLCscan provide information and data for collaborating withother selected FLCs. As private raw data is not sharedin FL, it enables privacy taking into consideration theprivacy policies of industry partners. Industrial FLCs

Page 15: Fusion of Federated Learning and Industrial Internet of ...

15

Automatic Machine Data Generation

M2M M2M

IIoT devices storing data and training models

Aggregation of models and continuous

updations Industry partner

Aggregation

1

2

5

Uploading

4Downloading

3

Federated global model

IIoT 1 IIoT 2

Fig. 5. A FL-enabled smart automobile industry manufacturing smart cars. FL models are being trained by edge IIoT devices which trains data generated byautomatic machines and the data fetched from industry partners. These models are regularly updated and sent to global cloud for aggregation and generatinga new global model which is reliable and efficient.

target to improve ML models using IIoT device data byupdating them regularly with its local data.

• M2M communications: These are required in automatedmachines so that they can send and receive data fromother machines and can work collaboratively resulting insmart manufacturing of vehicles. Machines communicatewith each other through various protocols in productionand manufacturing of smart vehicles. In case of M2Mcommunications, efficient routing protocols are requiredin dense and mobile environment [135], [136]. Thedriving force behind the FL-based IIoT is that smartmachines with M2M communications works better thanhumans with greater accuracies by consistently capturingand communicating data with each other.

• Quality of Information (QoI): Quality of informationmeans the information set that is received by globalmodel through aggregations and updations by local mod-els with their local data sets. As FLCs trains and evaluatesthe models based on its local data set, aggregated globalmodels have a diverse set of data with certain quality ofinformation. Furthermore, as different clients are operat-ing in the industry with less or almost no human inter-ventions [137], different approaches to collect data caninfluence QoI of generated data sets in edge IIoTs suchas free-of-error datasets, relevancy QoI, etc [138]. StoringQoI indicators along with existing industry metadata ofparticipating organizations can further enhance the abilityto build and update appropriate FL models.

• Scheduling Algorithms: These algorithms are required forcomputations and task offloading as FL plans can causeheavy loads on edge IIoT devices as training requires

large amount of data sets and this problem was identifiedin [139] and the need for scheduling algorithms. Thesealgorithms take care of any repetitions being done onolder data sets if any new data is to be processed andmodel should be trained on the basis of new data. Thescheduling device will give priority to this new data setand avoid any congestion.

2) Advantages: The FL enabled IIoT is proved beneficialin industry in the following ways:

• A FL approach which is dynamic and decentralized ap-proach, can be efficient in terms of design and allocationof resources by collaborative and simultaneous learningwith constant access of information from automated ma-chines and control systems releasing its data to IIoT edgedevices.

• FL enabled IoT is said to take care of privacy of thedata which the client has to constantly share with thetask executioner for generating and updating global modelthus mitigating the privacy issues by not providing accessto the raw training data. Privacy-protection techniquescan be applied to FL such as personalization techniques[26], FL enabled semi-supervised learning algorithms[140], threat models [24], mobile edge networks [22].By using large amount of data for devices with privacyprotection methods, FL can enable some impossible newapplications for traditional learning methods.

• FL approach enables smart manufacturing to take actionson the basis of global model and local model whichresults in reduced delay as compared to traditional tech-niques. The FL can be incorporated with other techniquessuch as Blockchain and edge computing to promote

Page 16: Fusion of Federated Learning and Industrial Internet of ...

16

various real-time systems for the IoT in Industries.3) Applications: FL enabled IoV is an autonomous system

which updates and trains global model in isolation, thusmaintaining the privacy of the user data. Existing IoT vehicularsystems are based on centralization of data in which eachvehicle uploads its data to a central cloud which puts alimit on amount on data being shared as there is a riskof data leakage. But, FL mitigates this issue and intelligentautonomous vehicular system could be developed which workson collaboration among vehicles. Thus it has many applica-tions such as intelligent traffic signal control, precise nav-igation systems, collaborative autonomous driving, collisionavoidance systems, vehicle platooning and many more.

Vehicle platooning system is very important application ofML based IoT systems. For building smart cities, autonomousvehicle platooning will play an important role in maintainingthe road safety and mitigates the problem of traffic conges-tion. FL enabled IoT along with vehicle-to-vehicle (V2V)communication will collect information from environment andtrains global model without any risk of data leakage, thusmaintaining the speed requirements and the V2V distance andprovide stability to vehicle’s control system. Modern Vehiclesare equipped with multiple sensors, and accessing data ofmultiple vehicles along with FL, improved platooning systemscan be achieved [141]. GPS, speed of vehicle, V2V distancedata can be accessed for predicting road safety, enabling abetter positioning systems and intelligent road systems [142].The camera of the vehicle and data associated with it can beaccessed for surveillance, any accidents etc. However, due tolarge data size and privacy concerns, it is challenging to uploadthe data on central cloud server. Therefore, FL could be apossible way to conduct distributed perception by processingthe data at the end vehicles and then integrating the results tomake the final decision.

B. Smart Health Care Industry

Healthcare industry is another area that we expect to benefitfrom the rising ML technique of FL. As the population isgrowing at alarming rate, there can be seen an alarmingpressure on hospital staff and quality of service being deliveredto patients by staff and medical practitioners. Thus, newefficient and reliable solutions are required from science andtechnology [143], [144].Because of fourth industrial revolu-tion, innovation has been achieved in manufacturing processwith increased efficiency in the supply chain [103]. In themedical field, smart Industry with IIoT and ML techniqueshas played vital role providing wide range of applications increating customised implants, tools and medical equipment.Smart healthcare industry improves the healthcare infrastruc-tures and biomedical systems keeping in view the safety ofpatients, their vitals and real time monitoring. Such automatedsystems manufactured by smart medical industry make smartdecisions through real-time communication and cooperationwith humans, machines, sensors, and so forth. [145]. Medi-cal Industry 4.0 manufactures smart devices which providesquality care to patients while addressing the issue of shortageof medical staff and beds especially in populated hospitals.

IIoT has been widely identified as a potential solution toalleviate the pressures on healthcare systems [146] by assistingmanufacturing and supply chain of medical devices. Remote-healthcare systems can monitor patients at home without anyneed of them visiting infirmary or hospitals. In essence, itcan improve access to medical resources, while reducing thepressure on the medical system, and it can allow people tobetter control their health. Health systems have improved a lotfrom the earlier times, thus increasing average life expectancyof the individuals. As a result, there is increased populationof elderly people living with chronic diseases, heart diseases,diabetes etc, thus placing huge demands of healthcare services.In order to address this issue, FL enabled IIoT systems aredeveloped which can learn patients medical data such as day-to-day health, disease symptoms, medical reports and vitalsand this data is shared with medical practitioners to subscribethem appropriate medication at their residence itself. There area lot of IIoT based healthcare systems that focus on particulardiseases and in-general monitoring of patients health. A systemfor generating rehabilitation plan based on patients symptomshas been developed [147] in which the individual’s condition iscompared with the symptoms of past patients from a databaseand studied their ailments and provided optimal treatments.IIoT provides anytime, anywhere and anything can be con-nected with each other with focus on M2M communication[116]. IIoT and IFL enable ML models on local raw data,stored in edge devices, to be updated to global model withoutactually sharing the local data which provides security tomanufacturing process and IIoT improves productivity [61].In this scheme, cloud server receives different parametersfrom local updated models trained on IIoT decentralized [148]edge devices and aggregates these local models to develop aglobal model [133], whose parameters are again sent to edgedevices for training based on raw data. This process is repeateduntil the global model achieves a well defined condition andaccuracy. But to successfully caters the challenges faced inindustry, FML approach needs to be adapted.

Thus FL along with IIoT is employed which keeps allthe sensitive data of automated systems and smart operatingmachines private in local edge devices. The machines aredigitally represented as assets [111] that are responsible forgeneration of data that is trained and evaluated by edgedevices.

The FL enabled smart healthcare industry model is depictedin Fig. 6 and following steps are followed:

1) Data Selection: Central cloud server needs an initial setof data to develop an initial global model. This dataset includes the data that is generated by assets whichdepicts automated machines. This data is accessed byedge devices which trains models based on ML. MLmodels depicts any anomalies present in the manufac-turing process of the smart medical devices. Due to somepotential differences in asset data, model updates maylead to negative knowledge transfer, thereby reducingmodel performance [149].

2) Model distribution and updation: This initial modelprepared by cloud is distributed to IIoT edge deviceswhich can update and train this model based on their

Page 17: Fusion of Federated Learning and Industrial Internet of ...

17

Automatic Machine Data Generation

M2M

Model training by local raw data

FL cohorts ensures optimal collaboration with respect

to model performance

1

2

5

Model M1

Aggregation of global model

Model M2

Model M3

Cloud Server

3

4

IIoT, Edge devices

FL cohort 1,2,3

Robots

Assets

Production Machines

M2M

Fig. 6. An FL-Enabled IoT healthcare system.

own local data. Due to the existence of well-trainedrobots, the ML model are constantly updated and theinformation is shared with other federated robots. Theseupdates are integrated asynchronously into the globalmodel to ultimately enhance the navigation skills ofall robots involved. In this way IFL can continuouslyupdates and reevaluate data similarity which is keyrequirement to ensure high model quality.

3) Aggregation: Algorithms such as FedAvg [133] is usedto solve this averaging problem yielding a well trainedGlobal model. It is decentralization approach that workwith non independent identical distribution (non-IID)data which requires sharing of the same model by allthe clients.

4) Federated Learning Cohorts: FL cohorts enables clientsor assets share their updates to those belonging to subsetof FL clients, who have submitted FL tasks that belongsto the same FL cohort. These updates are required toimprovise accuracy of their local models. In medicaldevice manufacturing industries, some assets requiressimilar conditions for their accurate working e.g., au-tomated production machines placed into environmentwith similar temperature, noise and other features con-sidered in the model prediction. Thus the IFL systemneeds to build FL cohorts.

5) Scheduling: A service scheduling machine and a virtualmanager (VM) is also required at the middle of IoTsubsystems and cloud. VM is responsible for collectingand storing sensor data of virtual sensors from automaticmachines and robots. Service scheduling machine is

responsible for resource management and scheduling ofseveral tasks with high inter-arrival rates and highly vari-ant run-times of IIoT subsystems with low parallelism,thus improving energy efficiency.

The above steps are repeated for developing an accurate globalmodel for developing medical devices in healthcare industry.As individuals data is not shared on the cloud making FL asa trustable way to train the models.

VI. CHALLENGES AND SOLUTIONS

The IIoT based systems are generally heterogeneous and thestructure is also too complex in nature. Though these systemsare advantageous, there are so many technical challenges inimplementing and deploying these types of applications in realtime. When these challenges are met by providing solutions,these systems will surely rule the world in near future.

A. Effective Data Management

The IIoT systems are highly heterogeneous in nature andthe data gathered using the data acquisition mechanisms orsensors and actuators are huge in volume and also at timesunstructured. Also in IIoT systems, the data gathered areusually streams of data at high velocity. Currently these hugestreams of data are stored in heterogeneous devices, gateways,edge servers, fog servers or cloud servers. The huge datagathered is raw and needs to be processed before taking anydecisions in real-time [150]. After processing, the processeddata needs to be transmitted to the required destination in anefficient manner. The data once archived needs to be retrievedwhen requirement arises and hence the data has to be available

Page 18: Fusion of Federated Learning and Industrial Internet of ...

18

at the right time for further analysis. While the data is archived,then arises the challenge for safe storage. Though IIoT is thebuzz term in the recent days, most of the industries are stillin a dilemma to implement these systems in real-time due tothe above mentioned challenges.

Solution: To cope up with these challenges, an efficient datamanagement model [151] has to be designed and utilized. Thedata model designed has to enable better storage mechanismswhile archiving as well as sharing the sensitive data. It shouldalso provide an efficient management scheme for handling thehuge volume of data like processing, analyzing, retrieving thedata at a higher speed,etc.

B. Heterogeneity and Interoperability

IIoT systems are typically comprised of heterogeneousmultiple vendor systems [152] such as machines, robots, IoTdevices, wearable sensors, actuators, networks which are eitherwired or wireless, 5G based cellular networks, broadbandnetworks, storage servers which may be cloud servers oredge servers. Integrating these multiple technologies to builda whole system is a challenging issue.

Solution: The researchers need to concentrate more ondeveloping resource sharing techniques [153], synchronizingtechniques, data sharing techniques and similar factors amongthe heterogeneous environment to improve the interoperability[154] and collaboration.

C. Server side attack

Preserving privacy of data and model is major issue indecentralized framework. Security mechanism must be appliedon Edge to end device communication and from edge tocollaborative model. Even in worst case, the model updatescan be tampered between end devices to collaborative engine.In FL, the update from end devices are sent to the cloud fortraining the global model. However there is a high chance ofstealing the updates from cloud by the malicious user whichresults in increase of inconsistency and noise data in the model.

Solution: Fully Homomorphic Encryption (FHE) is one ofthe best to overcome privacy threat challenge in the cloud. Alloperation can be encrypted using FHE except the activationfunctions. Even for activation function, higher degree polyno-mials or modified Chebyshev polynomial can be applied andthen can be encrypted using FHE. Recently Phong et al. [155]designed a model that balances the trade-off between dataprivacy and accuracy. It also ensures no information is leakedfrom the cloud server. The gradient updates are encrypted andis directly provided to neural networks where homomorphicfunction enables the computation among the other gradientupdates.

D. Optimization

As the end devices have limited memory and computationalrequirements, DL models must be optimized so that they canbe trained and deployed on IIoT or end devices efficiently.In terms of hardware optimization and model optimizationwould decreases computation time and energy consumption.

To summarize. Optimizing the DL and ML models is anessential part to execute the models at the edge devices whichremains a challenging task for the researchers even now. Theend device have only limited power and resource to train thelocal model as they are shared by the other devices. Practically,it is difficult to provide a static connection for all end devices.So, we have to reduce the computational complexity andcommunication overhead in FL. Therefore, there should be atradeoff between power consumption and resource allocationto train the model [130].

Solution: The main solution for the above challenge is touse MPC in FL by using secure aggregation protocol. So thebest possible approach would be to use the FHE based MPCthat can be implemented with limited rounds. Generally FHEprovides the confidentiality and privacy of the updates butthreshold FHE can reduce the communication overhead.

E. Inference Attack

Inference attack is an attack where the hacker can infersensitive information by the results of authorized query. In-formation can be inferred directly from the model or industryfrom the frequent queries. Generally the updates are sent to theserver by the local node which indirectly has the features ofthe models. Some of the privacy information or identificationof the user can also be inferred from the model updates. In-ference attacks can be inferring class representatives, inferringmembership, inferring properties and inferring training inputsand labels.There are two types of inference attack namelyidentification attack where the users personal information isleaked and another type of attack is matching attack whichcompares the two model updates [156].

Solution: The best way is to utilize differential policy byadding noise to the information such that the unauthorizeduser cannot distinguish the original information from noise.Therefore there should be a balance between the privacyof data and accuracy of the model. This can be done bychoosing the parameter to balance the above two parameters.Orekondy [157] stated that domain specific data augmentationcan provide effective results with minimal impact.

F. Client side Attack

In general, privacy security mechanisms are computationallyexpensive techniques that results delay in response. Securityissue on server side and client side remains a challenging taskeven the data communication is restricted. Some mechanismsnamely DP and homomorphic encryption technique controlsthe amount of data to be shared on the cloud. Model poisoningis one of the dangerous attack which introduces backdoor func-tionality [158] from the client side and poison the client model.Some of the popular backdoor attack model are label flippingattack [159] and Sybil attacks [160]. Based on the occurrenceof their scenario, the complexity of the attack differs. Oneof the client device is compromised and the local model istrained using the backdoor data with their new techniques andsubmits the resulting model. Finally this poisoned client modelis aggregated with other client model and replace the globalmodel with the backdoor model. Another way of poisoning

Page 19: Fusion of Federated Learning and Industrial Internet of ...

19

the model is the adversary can inject bad data into the modelstraining data.

Solution: To prevent backdoor attack combined anomalydetection algorithm, participant level DP and Byzantine-tolerant gradient descent can be used. [130] discussed aboutthe secure aggregation as the updates from each end deviceare not visible to the aggregator. There is no one singlesolution to overcome the model poisoning. So it is difficult todesign a best solution for various type poisoning techniques.Furthermore, various solution can be integrated into an auto-matic predictable model which is an open question to proceedresearch in future directions. Another recommended approachis participant level DP. More specifically, FedAvg one of theDP algorithm trains deep models on user partitioned data andanother feature is providing required level of privacy for eachparticipant.

G. Data Leakage

The data generated by the smart devices are confidential,resulting in more security and privacy practices. Data leakageis one of the major barriers to data sharing in the distributedchannel. Researchers are more focused on data leakage assensitive data breaches can lead to increased risk factorslike financial loss, online vandalism, future security costs.Providing secure, smart data transmission in a distributedenvironment is a major concern.

Solution: Integrating FL and blockchain into IIoT canadequately improve data stabilization and data quality. Luet al. [12] proposed a blockchain-based FL model that en-ables data privacy in IIoT applications. The research aims toenhance trust management while transferring sensitive datafrom source to destination, to provide a secure blockchain-based data-sharing architecture in a distributed environment.The researchers pay special focus by establishing a blockchainenvironment to resist data leaks and inhibit full access controlof it’s owner. During this process, FL models are incorporatedto conceptualize data models that often share data modelsin the communication network. However, smart mechanismsfor enhancing the data utility and expanding the amount ofresources for effective data sharing in a distributed setupshould be incorporated.

H. Data Privacy

Data privacy is the primary concern in IIoT, as most ofthe data will be sensitive and confidential, so it is essential tokeep these data secure. Data protection techniques are used toidentify unauthorized users and malicious attacks to overcomethese challenges. Data privacy is one of the key issue whenFL is used to train static streams and data streams. In general,privacy can be classified as global and local privacy. The modelupdates generated in each iteration on all the devices are notavailable to untrusted third parties other than the centralizedserver, whereas in local privacy, the updates are private to theserver.

Solution: In recent work [58], a PriModChain frameworkis proposed for the implementation of trust and data privacyin IIoT. The proposed model implemented data privacy with

a view to enhance the security and safety of the IIoT. TheSmart Contract helps in providing transparency within theIIoT framework whenever a communication link is formulatedbetween the distributed bodies and the central authorities.An interplanetary file system is used to achieve optimumlatency, immutability, efficient data archiving and secure in-formation sharing between the P2P network. Furthermore,the framework does not achieve optimum latency and thuscurtails its use in the IIoT system. Additionally, to improveresource optimization and data privacy for smart industries,the work in [110] unveils a dispersed federated approach forresource optimization and reliability. During this process, theproposed model aims to minimize FL costs, while optimizationis achieved by a decomposing and relaxation-based algorithm.Later [110] indicated the convergence of sub-problems andremedied them using a stochastic optimization technique.

I. Communication Overhead

Training in FL is a recursive process that involves intensivesharing of parameters among clients and servers. Conse-quently, to adopt FL to resource-constrained IIoT contexts,minimizing communication overhead is a key issue requiringin-depth investigation. However, some work has been under-taken to mitigate the use of resources in FL by lowering theload distribution factor between source and destination.

Solution: The work in [161] presented gradient-descent FLthat involves local updates and global convergence measures.In order to accomplish the desired exchange between localupdate and global accumulation, a control algorithm is intro-duced to significantly reduce the loss function in the context ofbudgetary resource concerns. Meanwhile, researchers in [162]exemplify ineffective client training with fairly low resources.Besides, the lack of global accumulations could reduce modelperformance and convergence. The allocation of resources,data training and data authentication must be considered inorder to improve the communication cost in FL.

J. Data Sharing in Horizontal Federated Learning

Industrial awareness is commonly observed using DL anddata mining techniques related to IoT data. However somedata is not easy to acquire one factory’s information, as thereare still few samples. If different alliance factories are ableto gather their information together a lot more informationcould be extracted. In addition, security of information is a keyissue for these factories. Traditional matrix-based techniquescan ensure information security within a factory, but will notfacilitate information sharing between factories, and therefore,due to lack of correlation, mining efficiency is weak.

Solution: To achieve these objectives, a new FederatedTensor Mining (FTM) approach is introduced in [57] to fed-erate multisource data for tensor-based mining while ensuringsecurity. The key contribution of FTM is that every factoryonly needs to share its ciphertext data for security issue,and these ciphertexts are adequate for tensor-based knowledgemining due to its homomorphic attribution. Real-data-drivensimulations demonstrate that FTM not only mines the sameknowledge compared with the plaintext mining, but also is

Page 20: Fusion of Federated Learning and Industrial Internet of ...

20

enabled to defend the attacks from distributed eavesdroppersand centralized hackers.

K. Anomaly Detection

Effective anomaly detection techniques based on sensingbig data become extremely important to ensure stability forindustrial applications. Since the failure of edge devices (i.e.anomalies) has a serious impact on the production processin IIoT, the relevant and reliable detection of anomalies isbecoming extremely important. In addition, the data collectedby the edge device contains tremendous private information,which is challenging as user privacy has become a seriousconcern.

Solution: To address this issue, Liu et al. [14] developeda communication-efficient FL-based deep anomaly detectionmodel for IIoT data sensing. During this process, a FLmodel was first introduced to allow decentralised edge devicesto train an anomaly detection model, which could enhanceits classification accuracy. In the second step, an AttentionMechanism-based Convolutional Neural Network-Long ShortTerm Memory (AMCNN-LSTM) approach is formulated foraccurate detection of anomalies. The AMCNN-LSTM methoduses attention mechanism-based CNN units to identify relevantfine-grained features, to prevent memory loss and gradientdispersion risks. In addition, the proposed model retains themerits of the LSTM unit in anticipating time series informa-tion. In the third step, a gradient compression technique toimprove communication process is proposed.

In order to solve these problems, an innovative FL-baseddeep anomaly monitoring system is proposed for smart edgedevices. Liu et al. [163] proposed a CNN-LSTM FL-basedframework for detecting anomalies in IIoT applications. Toimprove the data transmission efficacy of the presented frame-work, a gradient compression technique based on top-k selec-tion has been used to minimise network overhead.

L. Data Integrity

Blockchain technology is used in IIoT-based FL to providedata integrity in order to obtain adequate customer informa-tion and training computing capabilities. Blockchain-based FLsystems face some of the challenges in the design phase toaddress the barriers of data heterogeneity in IIoT during earlierfault detection. However, existing security mechanisms do notfocus on designing the effective framework and blockchain’sreliability issue.

Solution: In [13], a new FL-based blockchain architectureis proposed to detect failures in IIoT applications. During thisprocess, each client server continuously builds a Merkle treewhere each leaf node signifies the client data and saves theroot tree to the blockchain. An on-chain system is designed tomeasure each client’s impact, based on the size of client dataused in local model training.

M. Data and Model Attack

Even though data sharing is minimal in FL, still there can beseveral attacks in the framework at different levels. In general

attacks can happen at any stage starting from the basic nodelevel to the server level. They are as follows:

1) Data Attack- Attack in local data at the end device2) Model Attack- Attack that happens in local trained

model and global model in centralised serverData based attack: Changing the data or labels used in

the training data falls under this category of the attack. Dataattack can be a clean label and dirty label attack. When theunauthorized person cannot change the label of data as thiswould lead to misclassification of data is known as clean labelattack. In contrast in dirty label attack, the intruders attemptto include some number of samples which would mislead theML process.

Model based attack: Changing the model updates com-municated would cause the decrease in model efficiency.Hacking the model trained in local node results in leaking useridentification data globally. Aggregated global model presentin the centralized server is also prone to attack. Sometimesthe model are targeted for misclassification by changing thetraining process. Even though DNN-based approaches havebecome more popular in IIoT applications, they are currentlyexperiencing some disadvantages, such as massive numbersof adversarial attacks. How to build DNNs that are immuneto different types of attacks are becoming a major bottleneckwhen deploying IIoT safety applications.

Solution: In order to provide solutions to these kinds ofproblems, Song et al. [164] developed federated defensivesystem for cloud-based IIoT architectures. The revised FLmodel and the implemented adversarial learning loss function,the proposed strategy can metabolise DNNs to successfullyhandle existing adversarial threats while guaranteeing privacyprotection among various IIoT devices. The obtained resultswere compared with two popular benchmarks, and the resultsreveal that the improved method can not only increase overallprotection against various existing adversarial attacks, but canalso accurately detect the improper behaviour of DNN inducedby new attacks.

VII. CONCLUSION

The new paradigm of integrating FL on IIoT data wassurveyed and several approaches and techniques associatedwith it are introduced. In specific, we highlighted the char-acteristics and benefits of IIoT in terms of FL and distributivelearning. The motivation behind the integration of IIoT withFL is also discussed. Subsequently, several privacy preservingML/DL and blockchain models employed in FL and IIoT arepresented. Then, survey on algorithms to handle heterogenousdata is summarized. Contribution of researchers in the field ofFL with data and resource management followed by severalchallenges and solutions are discussed. Reviewing automobileand healthcare applications on IIoT with FL would providethe researchers to understand the principal components ofIIoT smart devices. We also identified various challenges,solutions and future research direction in path of FL for IIoT.In summary, the advantages of extending edge intelligenceto intelligent edge is to reduce the communication cost,computing power and energy consumption. Also, this paper

Page 21: Fusion of Federated Learning and Industrial Internet of ...

21

would enhance the readers to consolidate and combine theinformation with respect to IIoT, FL and privacy preservingmechanism, data management and storage mechanisms undera common new paradigm of Industry 4.0.

In future, researchers should focus on developing algorithmsand methodologies to impose privacy constraints across edgedevice and data level. The privacy preserving model for FLshould be designed to learn multiple geographically distributedtraining data sets without exploiting the sensitive informationof the user. The existing architecture violates the usage ofshort packets for low latency communication. Therefore newgeneration of wireless systems such as 6G cellular networkscan be integrated with FL enabled IIoT networks to providemore importance for privacy preservation.

REFERENCES

[1] J. Joung, “Machine learning-based antenna selection in wireless com-munications,” IEEE Communications Letters, vol. 20, no. 11, pp. 2241–2244, 2016.

[2] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: Deep learning forthe Internet of things with edge computing,” IEEE network, vol. 32,no. 1, pp. 96–101, 2018.

[3] N. C. Luong, D. T. Hoang, S. Gong, D. Niyato, P. Wang, Y.-C.Liang, and D. I. Kim, “Applications of deep reinforcement learningin communications and networking: A survey,” IEEE CommunicationsSurveys & Tutorials, vol. 21, no. 4, pp. 3133–3174, 2019.

[4] Q.-V. Pham, F. Fang, V. N. Ha, M. J. Piran, M. Le, L. B. Le, W.-J.Hwang, and Z. Ding, “A survey of multi-access edge computing in5G and beyond: Fundamentals, technology integration, and state-of-the-art,” IEEE Access, vol. 8, pp. 116 974–117 017, 2020.

[5] J. Konecny, H. B. McMahan, F. X. Yu, P. Richtarik, A. T. Suresh, andD. Bacon, “Federated learning: Strategies for improving communica-tion efficiency,” arXiv preprint arXiv:1610.05492, 2016.

[6] T. Yang, G. Andrew, H. Eichner, H. Sun, W. Li, N. Kong, D. Ram-age, and F. Beaufays, “Applied federated learning: Improving googlekeyboard query suggestions,” arXiv preprint arXiv:1812.02903, 2018.

[7] J. Xu and F. Wang, “Federated learning for healthcare informatics,”arXiv preprint arXiv:1911.06270, 2019.

[8] J. C. Jiang, B. Kantarci, S. Oktug, and T. Soyata, “Federated learningin smart city sensing: Challenges and opportunities,” Sensors, vol. 20,no. 21, p. 6230, 2020.

[9] M. J. Sheller, B. Edwards, G. A. Reina, J. Martin, S. Pati, A. Kotrotsou,M. Milchenko, W. Xu, D. Marcus, R. R. Colen et al., “Federated learn-ing in medicine: facilitating multi-institutional collaborations withoutsharing patient data,” Scientific reports, vol. 10, no. 1, pp. 1–12, 2020.

[10] S. Niknam, H. S. Dhillon, and J. H. Reed, “Federated learning forwireless communications: Motivation, opportunities, and challenges,”IEEE Communications Magazine, vol. 58, no. 6, pp. 46–51, 2020.

[11] Q.-V. Pham, N. T. Nguyen, T. Huynh-The, L. B. Le, K. Lee, and W.-J.Hwang, “Intelligent radio signal processing: A contemporary survey,”arXiv preprint arXiv:2008.08264, 2020.

[12] Y. Lu, X. Huang, Y. Dai, S. Maharjan, and Y. Zhang, “Blockchainand federated learning for privacy-preserved data sharing in industrialIoT,” IEEE Transactions on Industrial Informatics, vol. 16, no. 6, pp.4177–4186, 2019.

[13] W. Zhang, Q. Lu, Q. Yu, Z. Li, Y. Liu, S. K. Lo, S. Chen, X. Xu,and L. Zhu, “Blockchain-based federated learning for device failuredetection in industrial IoT,” IEEE Internet of Things Journal, 2020.

[14] Y. Liu, S. Garg, J. Nie, Y. Zhang, Z. Xiong, J. Kang, and M. S. Hossain,“Deep anomaly detection for time-series data in industrial IoT: Acommunication-efficient on-device federated learning approach,” IEEEInternet of Things Journal, 2020.

[15] L. Da Xu, W. He, and S. Li, “Internet of things in industries: A survey,”IEEE Transactions on industrial informatics, vol. 10, no. 4, pp. 2233–2243, 2014.

[16] Y. Lu, “Industry 4.0: a survey on technologies, applications and openresearch issues,” Journal of industrial information integration, vol. 6,pp. 1–10, 2017.

[17] W. Z. Khan, M. Rehman, H. M. Zangoti, M. K. Afzal, N. Armi, andK. Salah, “Industrial internet of things: Recent advances, enabling tech-nologies and open challenges,” Computers & Electrical Engineering,vol. 81, p. 106522, 2020.

[18] M. H. ur Rehman, I. Yaqoob, K. Salah, M. Imran, P. P. Jayaraman,and C. Perera, “The role of big data analytics in industrial internet ofthings,” Future Generation Computer Systems, vol. 99, pp. 247–259,2019.

[19] K. Tange, M. De Donno, X. Fafoutis, and N. Dragoni, “A systematicsurvey of industrial internet of things security: Requirements and fogcomputing opportunities,” IEEE Communications Surveys & Tutorials,vol. 22, no. 4, pp. 2489–2520, 2020.

[20] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning:Concept and applications,” ACM Transactions on Intelligent Systemsand Technology (TIST), vol. 10, no. 2, pp. 1–19, 2019.

[21] T. Li, A. K. Sahu, A. Talwalkar, and V. Smith, “Federated learning:Challenges, methods, and future directions,” IEEE Signal ProcessingMagazine, vol. 37, no. 3, pp. 50–60, 2020.

[22] W. Y. B. Lim, N. C. Luong, D. T. Hoang, Y. Jiao, Y.-C. Liang, Q. Yang,D. Niyato, and C. Miao, “Federated learning in mobile edge networks:A comprehensive survey,” IEEE Communications Surveys & Tutorials,vol. 22, no. 3, pp. 2031–2063, 2020.

[23] V. Mothukuri, R. M. Parizi, S. Pouriyeh, Y. Huang, A. Dehghantanha,and G. Srivastava, “A survey on security and privacy of federatedlearning,” Future Generation Computer Systems, vol. 115, pp. 619–640, 2020.

[24] L. Lyu, H. Yu, and Q. Yang, “Threats to federated learning: A survey,”arXiv preprint arXiv:2003.02133, 2020.

[25] M. Aledhari, R. Razzak, R. M. Parizi, and F. Saeed, “Federated learn-ing: A survey on enabling technologies, protocols, and applications,”IEEE Access, vol. 8, pp. 140 699–140 725, 2020.

[26] V. Kulkarni, M. Kulkarni, and A. Pant, “Survey of personalizationtechniques for federated learning,” arXiv preprint arXiv:2003.08673,2020.

[27] Y. Liu, X. Yuan, Z. Xiong, J. Kang, X. Wang, and D. Niyato,“Federated learning for 6G communications: Challenges, methods, andfuture directions,” China Communications, vol. 17, no. 9, pp. 105–118,2020.

[28] Z. Zhao, C. Feng, H. H. Yang, and X. Luo, “Federated-learning-enabled intelligent fog radio access networks: Fundamental theory, keytechniques, and future trends,” IEEE Wireless Communications, vol. 27,no. 2, pp. 22–28, 2020.

[29] J. Kang, Z. Xiong, D. Niyato, Y. Zou, Y. Zhang, and M. Guizani,“Reliable federated learning for mobile networks,” IEEE WirelessCommunications, vol. 27, no. 2, pp. 72–80, 2020.

[30] K. Yang, Y. Shi, Y. Zhou, Z. Yang, L. Fu, and W. Chen, “Federatedmachine learning for intelligent IoT via reconfigurable intelligentsurface,” IEEE Network, vol. 34, no. 5, pp. 16–22, 2020.

[31] L. U. Khan, S. R. Pandey, N. H. Tran, W. Saad, Z. Han, M. N. Nguyen,and C. S. Hong, “Federated learning for edge networks: Resource opti-mization and incentive mechanism,” IEEE Communications Magazine,vol. 58, no. 10, pp. 88–93, 2020.

[32] L. U. Khan, W. Saad, Z. Han, E. Hossain, and C. S. Hong, “Federatedlearning for internet of things: Recent advances, taxonomy, and openchallenges,” arXiv preprint arXiv:2009.13012, 2020.

[33] S. Bhattacharya, T. R. Gadekallu, M. Alazab, and P. K. R. Maddikunta,“A review on deep learning for future smart cities,” Internet TechnologyLetters, p. e187, 2020.

[34] X. Wu, J. Zhang, and F.-Y. Wang, “Stability-based generalization anal-ysis of distributed learning algorithms for big data,” IEEE Transactionson Neural Networks and Learning Systems, vol. 31, no. 3, pp. 801–812,2019.

[35] J. Zhang and O. Simeone, “LAGC: Lazily aggregated gradient codingfor straggler-tolerant and communication-efficient distributed learning,”IEEE Transactions on Neural Networks and Learning Systems, pp. 1–13, 2020.

[36] W. Du, A. Li, P. Zhou, Z. Xu, X. Wang, H. Jiang, and D. Wu,“Approximate to be great: Communication efficient and privacy-preserving large-scale distributed deep learning in internet of things,”IEEE Internet of Things Journal, vol. 7, no. 12, pp. 11 678–11 692,2020.

[37] M. Rossi, M. Centenaro, A. Ba, S. Eleuch, T. Erseghe, and M. Zorzi,“Distributed learning algorithms for optimal data routing in IoT net-works,” IEEE Transactions on Signal and Information Processing overNetworks, vol. 6, pp. 179–195, 2020.

[38] K. Dev, S. A. Khowaja, A. Jaiswal, A. S. Bist, V. Saini, and S. Bhatia,“Triage of potential COVID-19 patients from chest X-ray images usinghierarchical convolutional networks,” 2020.

[39] G. Raja, Y. Manaswini, G. D. Vivekanandan, H. Sampath, K. Dev,and A. K. Bashir, “AI-powered blockchain-a decentralized secure

Page 22: Fusion of Federated Learning and Industrial Internet of ...

22

multiparty computation protocol for IoV,” in IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops (INFO-COM WKSHPS). IEEE, 2020, pp. 865–870.

[40] M. Bogowicz, A. Jochems, T. M. Deist, S. Tanadini-Lang, S. H. Huang,B. Chan, J. N. Waldron, S. Bratman, B. O’Sullivan, O. Riesterer et al.,“Privacy-preserving distributed learning of radiomics to predict overallsurvival and HPV status in head and neck cancer,” Scientific reports,vol. 10, no. 1, pp. 1–10, 2020.

[41] Y. Xu, H. Liu, and Z. Long, “A distributed computing framework forwind speed big data forecasting on apache spark,” Sustainable EnergyTechnologies and Assessments, vol. 37, p. 100582, 2020.

[42] G. T. Reddy, M. P. K. Reddy, K. Lakshmanna, R. Kaluri, D. S. Rajput,G. Srivastava, and T. Baker, “Analysis of dimensionality reductiontechniques on big data,” IEEE Access, vol. 8, pp. 54 776–54 788, 2020.

[43] Z. Huang, R. Hu, Y. Guo, E. Chan-Tin, and Y. Gong, “DP-ADMM:ADMM-based distributed learning with differential privacy,” IEEETransactions on Information Forensics and Security, vol. 15, pp. 1002–1012, 2019.

[44] G. Xu, H. Li, S. Liu, K. Yang, and X. Lin, “Verifynet: Secureand verifiable federated learning,” IEEE Transactions on InformationForensics and Security, vol. 15, pp. 911–926, 2019.

[45] F. Sattler, S. Wiedemann, K.-R. Muller, and W. Samek, “Robust andcommunication-efficient federated learning from non-IID data,” IEEEtransactions on neural networks and learning systems, vol. 31, no. 9,pp. 3400–3413, 2019.

[46] P. Mathur, “Overview of IoT and IIoT,” in IoT Machine LearningApplications in Telecom, Energy, and Agriculture. Springer, 2020,pp. 19–43.

[47] F. Farivar, M. S. Haghighi, A. Jolfaei, and M. Alazab, “Artificialintelligence for detection, estimation, and compensation of maliciousattacks in nonlinear cyber-physical systems and industrial IoT,” IEEEtransactions on industrial informatics, vol. 16, no. 4, pp. 2716–2725,2019.

[48] H. Liao, Z. Zhou, X. Zhao, L. Zhang, S. Mumtaz, A. Jolfaei, S. H.Ahmed, and A. K. Bashir, “Learning-based context-aware resource al-location for edge-computing-empowered industrial IoT,” IEEE Internetof Things Journal, vol. 7, no. 5, pp. 4260–4277, 2019.

[49] R. Vinayakumar, M. Alazab, S. Srinivasan, Q.-V. Pham, S. K. Padan-nayil, and K. Simran, “A visualized botnet detection system baseddeep learning for the internet of things networks of smart cities,” IEEETransactions on Industry Applications, vol. 56, no. 4, pp. 4436–4456,2020.

[50] W. Zheng, C.-F. Lai, D. He, N. Kumar, and B. Chen, “Securestorage auditing with efficient key updates for cognitive industrial IoTenvironment,” IEEE Transactions on Industrial Informatics, 2020.

[51] H. Huang, S. Ding, L. Zhao, H. Huang, L. Chen, H. Gao, and S. H.Ahmed, “Real-time fault-detection for IIoT facilities using GBRBM-based DNN,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 5713–5722, 2019.

[52] S. Liu, C. Guo, F. Al-Turjman, K. Muhammad, and V. H. C. de Al-buquerque, “Reliability of response region: A novel mechanism invisual tracking by edge computing for IIoT environments,” MechanicalSystems and Signal Processing, vol. 138, p. 106537, 2020.

[53] G. Aceto, V. Persico, and A. Pescape, “A survey on informationand communication technologies for Industry 4.0: state-of-the-art, tax-onomies, perspectives, and challenges,” IEEE Communications Surveys& Tutorials, vol. 21, no. 4, pp. 3467–3501, 2019.

[54] V. Priya, I. S. Thaseen, T. R. Gadekallu, M. K. Aboudaif,and E. A. Nasr, “Robust attack detection approach for IIoTusing ensemble classifier,” Computers, Materials & Continua,vol. 66, no. 3, pp. 2457–2470, 2021. [Online]. Available: http://www.techscience.com/cmc/v66n3/41077

[55] L. Li, K. Ota, and M. Dong, “Deep learning for smart industry:Efficient manufacture inspection system with fog computing,” IEEETransactions on Industrial Informatics, vol. 14, no. 10, pp. 4665–4673,2018.

[56] J. Huang, L. Kong, G. Chen, M.-Y. Wu, X. Liu, and P. Zeng, “Towardssecure industrial IoT: Blockchain system with credit-based consensusmechanism,” IEEE Transactions on Industrial Informatics, vol. 15,no. 6, pp. 3680–3689, 2019.

[57] L. Kong, X.-Y. Liu, H. Sheng, P. Zeng, and G. Chen, “Federated tensormining for secure industrial internet of things,” IEEE Transactions onIndustrial Informatics, vol. 16, no. 3, pp. 2144–2153, 2019.

[58] P. C. M. Arachchige, P. Bertok, I. Khalil, D. Liu, S. Camtepe, andM. Atiquzzaman, “A trustworthy privacy preserving framework formachine learning in industrial IoT systems,” IEEE Transactions onIndustrial Informatics, vol. 16, no. 9, pp. 6092–6102, 2020.

[59] L. Kuang, L. T. Yang, J. Feng, and M. Dong, “Secure tensor decomposi-tion using fully homomorphic encryption scheme,” IEEE Transactionson Cloud Computing, vol. 6, no. 3, pp. 868–878, 2015.

[60] T. Huynh-The, C.-H. Hua, Q.-V. Pham, and D.-S. Kim, “MCNet: Anefficient CNN architecture for robust automatic modulation classifi-cation,” IEEE Communications Letters, vol. 24, no. 4, pp. 811–815,2020.

[61] R. Shokri and V. Shmatikov, “Privacy-preserving deep learning,” inProceedings of the 22nd ACM SIGSAC conference on computer andcommunications security, 2015, pp. 1310–1321.

[62] N. Deepa, Q.-V. Pham, D. C. Nguyen, S. Bhattacharya, T. R. Gadekallu,P. K. R. Maddikunta, F. Fang, P. N. Pathirana et al., “A surveyon blockchain for big data: Approaches, opportunities, and futuredirections,” arXiv preprint arXiv:2009.00858, 2020.

[63] S. Hakak, W. Z. Khan, G. A. Gilkar, B. Assiri, M. Alazab, S. Bhat-tacharya, and G. T. Reddy, “Recent advances in blockchain tech-nology: A survey on applications and challenges,” arXiv preprintarXiv:2009.05718, 2020.

[64] Y. E. Wang, G.-Y. Wei, and D. Brooks, “Benchmarking TPU, GPU, andCPU platforms for deep learning,” arXiv preprint arXiv:1907.10701,2019.

[65] H.-D. Cho, P. D. P. Engineer, K. Chung, and T. Kim, “Benefits of thebig. LITTLE architecture,” EETimes, Feb, 2012.

[66] X. Zhang, X. Chen, J. K. Liu, and Y. Xiang, “Deeppar and deepdpa:Privacy preserving and asynchronous deep learning for industrial IoT,”IEEE Transactions on Industrial Informatics, vol. 16, no. 3, pp. 2081–2090, 2019.

[67] X. Wang, C. Wang, X. Li, V. C. Leung, and T. Taleb, “Federateddeep reinforcement learning for internet of things with decentralizedcooperative edge caching,” IEEE Internet of Things Journal, vol. 7,no. 10, pp. 9441–9455, 2020.

[68] Y. Liu, J. James, J. Kang, D. Niyato, and S. Zhang, “Privacy-preservingtraffic flow prediction: A federated learning approach,” IEEE Internetof Things Journal, vol. 7, no. 8, pp. 7751–7763, 2020.

[69] Y. Chen, X. Sun, and Y. Jin, “Communication-efficient federated deeplearning with layerwise asynchronous model update and temporallyweighted aggregation,” IEEE Transactions on Neural Networks andLearning Systems, vol. 31, no. 10, pp. 4229–4238, 2019.

[70] B. Yin, H. Yin, Y. Wu, and Z. Jiang, “FDC: A secure federated deeplearning mechanism for data collaborations in the internet of things,”IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6348–6359, 2020.

[71] H. Kim, J. Park, M. Bennis, and S.-L. Kim, “Blockchained on-devicefederated learning,” IEEE Communications Letters, vol. 24, no. 6, pp.1279–1283, 2019.

[72] Y. Qu, L. Gao, T. H. Luan, Y. Xiang, S. Yu, B. Li, and G. Zheng,“Decentralized privacy using blockchain-enabled federated learning inFog Computing,” IEEE Internet of Things Journal, vol. 7, no. 6, pp.5171–5183, 2020.

[73] S. R. Pokhrel and J. Choi, “Federated learning with blockchain forautonomous vehicles: Analysis and design challenges,” IEEE Transac-tions on Communications, vol. 68, no. 8, pp. 4734–4746, 2020.

[74] Z. Li, J. Liu, J. Hao, H. Wang, and M. Xian, “CrowdSFL: A securecrowd computing framework based on blockchain and federated learn-ing,” Electronics, vol. 9, no. 5, p. 773, 2020.

[75] G. Hua, L. Zhu, J. Wu, C. Shen, L. Zhou, and Q. Lin, “Blockchain-based federal learning for intelligent control in heavy haul railway,”IEEE Access, vol. 8, pp. 176 830–176 839, 2020.

[76] P. K. Sharma, J. H. Park, and K. Cho, “Blockchain and federatedlearning-based distributed computing defence framework for sustain-able society,” Sustainable Cities and Society, vol. 59, p. 102220, 2020.

[77] W. Guo, I. Kotsia, and I. Patras, “Tensor learning for regression,” IEEETransactions on Image Processing, vol. 21, no. 2, pp. 816–827, 2011.

[78] Z. Lai, W. K. Wong, Y. Xu, C. Zhao, and M. Sun, “Sparse alignmentfor robust tensor learning,” IEEE transactions on neural networks andlearning systems, vol. 25, no. 10, pp. 1779–1792, 2014.

[79] J. Feng, L. T. Yang, X. Liu, and R. Zhang, “Privacy-preserving tensoranalysis and processing models for wireless internet of things,” IEEEWireless Communications, vol. 25, no. 6, pp. 98–103, 2018.

[80] S. P. RM, S. Bhattacharya, P. K. R. Maddikunta, S. R. K. Somayaji,K. Lakshmanna, R. Kaluri, A. Hussien, and T. R. Gadekallu, “Loadbalancing of energy cloud using wind driven and firefly algorithms ininternet of everything,” Journal of Parallel and Distributed Computing,vol. 142, pp. 16–26, 2020.

[81] B. Babayigit and H. Sattuf, “An IIoT and web-based low-cost SCADAsystem for industrial automation,” in 2019 11th International Confer-ence on Electrical and Electronics Engineering (ELECO). IEEE, 2019,pp. 890–894.

Page 23: Fusion of Federated Learning and Industrial Internet of ...

23

[82] S. Leminen, M. Rajahonka, R. Wendelin, and M. Westerlund, “In-dustrial internet of things business models in the machine-to-machinecontext,” Industrial Marketing Management, vol. 84, pp. 298–311,2020.

[83] S. Munirathinam, “Industry 4.0: Industrial internet of things (IIoT),” inAdvances in Computers. Elsevier, 2020, vol. 117, no. 1, pp. 129–164.

[84] T. Reddy, S. P. RM, M. Parimala, C. L. Chowdhary, S. Hakak, W. Z.Khan et al., “A deep neural networks based model for uninterruptedmarine environment monitoring,” Computer Communications, vol. 157,pp. 64–75, 2020.

[85] M. Parimala, R. Swarna Priya, M. Praveen Kumar Reddy,C. Lal Chowdhary, R. Kumar Poluru, and S. Khan, “Spatiotemporal-based sentiment analysis on tweets for risk assessment of event usingdeep learning approach,” Software: Practice and Experience, 2020.

[86] Q.-V. Pham, H. T. Nguyen, Z. Han, and W.-J. Hwang, “Coalitionalgames for computation offloading in NOMA-enabled multi-access edgecomputing,” IEEE Transactions on Vehicular Technology, vol. 69, no. 2,pp. 1982–1993, 2020.

[87] J.-S. Fu, Y. Liu, H.-C. Chao, B. K. Bhargava, and Z.-J. Zhang,“Secure data storage and searching for industrial IoT by integratingfog computing and cloud computing,” IEEE Transactions on IndustrialInformatics, vol. 14, no. 10, pp. 4519–4528, 2018.

[88] P. Borylo, A. Lason, J. Rzasa, A. Szymanski, and A. Jajszczyk,“Energy-aware fog and cloud interplay supported by wide area soft-ware defined networking,” in 2016 IEEE International Conference onCommunications (ICC). IEEE, 2016, pp. 1–7.

[89] K. Kaur, S. Garg, G. S. Aujla, N. Kumar, J. J. Rodrigues, andM. Guizani, “Edge computing in the industrial internet of thingsenvironment: Software-defined-networks-based edge-cloud interplay,”IEEE communications magazine, vol. 56, no. 2, pp. 44–51, 2018.

[90] A. Singh, S. Garg, K. Kaur, S. Batra, N. Kumar, and K.-K. R. Choo,“Fuzzy-folded bloom filter-as-a-service for big data storage in thecloud,” IEEE Transactions on Industrial Informatics, vol. 15, no. 4,pp. 2338–2348, 2018.

[91] M. Saqlain, M. Piao, Y. Shim, and J. Y. Lee, “Framework of an IoT-based industrial data management for smart manufacturing,” Journalof Sensor and Actuator Networks, vol. 8, no. 2, p. 25, 2019.

[92] S. D. Anton, D. Fraunholz, J. Zemitis, F. Pohl, and H. D. Schot-ten, “Highly scalable and flexible model for effective aggregationof context-based data in generic IIoT scenarios,” arXiv preprintarXiv:1906.03064, 2019.

[93] S. Du, B. Liu, H. Ma, G. Wu, and P. Wu, “IIoT -based intelligentcontrol and management system for motorcycle endurance test,” IEEEAccess, vol. 6, pp. 30 567–30 576, 2018.

[94] J. Wan, S. Tang, Z. Shu, D. Li, S. Wang, M. Imran, and A. V. Vasilakos,“Software-defined industrial internet of things in the context of Industry4.0,” IEEE Sensors Journal, vol. 16, no. 20, pp. 7373–7380, 2016.

[95] Z. Shu, J. Wan, D. Zhang, and D. Li, “Cloud-integrated cyber-physicalsystems for complex industrial applications,” Mobile Networks andApplications, vol. 21, no. 5, pp. 865–878, 2016.

[96] C. H. Liu, Q. Lin, and S. Wen, “Blockchain-enabled data collectionand sharing for industrial IoT with deep reinforcement learning,” IEEETransactions on Industrial Informatics, vol. 15, no. 6, pp. 3516–3526,2018.

[97] P. D. U. Coronado, R. Lynn, W. Louhichi, M. Parto, E. Wescoat, andT. Kurfess, “Part data integration in the shop floor digital twin: Mobileand cloud technologies to enable a manufacturing execution system,”Journal of Manufacturing Systems, vol. 48, pp. 25–33, 2018.

[98] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen,“In-edge AI: Intelligentizing mobile edge computing, caching andcommunication by federated learning,” IEEE Network, vol. 33, no. 5,pp. 156–165, 2019.

[99] B. Liu, L. Wang, M. Liu, and C.-Z. Xu, “Federated imitation learning:A novel framework for cloud robotic systems with heterogeneoussensor data,” IEEE Robotics and Automation Letters, vol. 5, no. 2,pp. 3509–3516, 2020.

[100] H. Zhu and Y. Jin, “Multi-objective evolutionary federated learning,”IEEE transactions on neural networks and learning systems, vol. 31,no. 4, pp. 1310–1322, 2019.

[101] R. Kanagavelu, Z. Li, J. Samsudin, Y. Yang, F. Yang, R. S. M. Goh,M. Cheah, P. Wiwatphonthana, K. Akkarajitsakul, and S. Wangz, “Two-phase multi-party computation enabled privacy-preserving federatedlearning,” arXiv preprint arXiv:2005.11901, 2020.

[102] M. Hao, H. Li, X. Luo, G. Xu, H. Yang, and S. Liu, “Efficientand privacy-enhanced federated learning for industrial artificial intelli-gence,” IEEE Transactions on Industrial Informatics, vol. 16, no. 10,pp. 6532–6542, 2019.

[103] K. Zhang, Y. Zhu, S. Maharjan, and Y. Zhang, “Edge intelligence andblockchain empowered 5G beyond for the industrial internet of things,”IEEE Network, vol. 33, no. 5, pp. 12–19, 2019.

[104] B. Qolomany, K. Ahmad, A. Al-Fuqaha, and J. Qadir, “Particle swarmoptimized federated learning for industrial IoT and smart city services,”arXiv preprint arXiv:2009.02560, 2020.

[105] F. Tao, Q. Qi, L. Wang, and A. Nee, “Digital twins and cyber–physicalsystems toward smart manufacturing and Industry 4.0: correlation andcomparison,” Engineering, vol. 5, no. 4, pp. 653–661, 2019.

[106] F. Learning, “a step closer towards confidential AI,” 2020.[107] M. Chen, Z. Yang, W. Saad, C. Yin, H. V. Poor, and S. Cui, “A joint

learning and communications framework for federated learning overwireless networks,” IEEE Transactions on Wireless Communications,pp. 1–14, 2020.

[108] M. S. H. Abad, E. Ozfatura, D. Gunduz, and O. Ercetin, “Hierarchicalfederated learning across heterogeneous cellular networks,” in ICASSP2020-2020 IEEE International Conference on Acoustics, Speech andSignal Processing (ICASSP). IEEE, 2020, pp. 8866–8870.

[109] L. Huang, X. Feng, C. Zhang, L. Qian, and Y. Wu, “Deep reinforcementlearning-based joint task offloading and bandwidth allocation for multi-user mobile edge computing,” Digital Communications and Networks,vol. 5, no. 1, pp. 10–17, 2019.

[110] L. U. Khan, M. Alsenwi, I. Yaqoob, M. Imran, Z. Han, and C. S. Hong,“Resource optimized federated learning-enabled cognitive internet ofthings for smart industries,” IEEE Access, vol. 8, pp. 168 854–168 864,2020.

[111] T. Hiessl, D. Schall, J. Kemnitz, and S. Schulte, “Industrial fed-erated learning–requirements and system design,” arXiv preprintarXiv:2005.06850, 2020.

[112] Y. Lu, X. Huang, K. Zhang, S. Maharjan, and Y. Zhang, “Low-latency federated learning and blockchain for edge association indigital twin empowered 6G networks,” IEEE Transactions on IndustrialInformatics, pp. 1–10, 2020.

[113] S. Messaoud, A. Bradai, O. Ben Ahmed, P. Quang, M. Atri, andM. S. Hossain, “Deep federated Q-learning-based network slicing forindustrial IoT,” IEEE Transactions on Industrial Informatics, pp. 1–11,2020.

[114] H. Zhang and L. Hanzo, “Federated learning assisted multi-UAVnetworks,” IEEE Transactions on Vehicular Technology, 2020.

[115] Q.-V. Pham, T. Huynh-The, M. Alazab, J. Zhao, and W.-J. Hwang,“Sum-rate maximization for UAV-assisted visible light communicationsusing NOMA: Swarm intelligence meets machine learning,” IEEEInternet of Things Journal, vol. 7, no. 10, pp. 10 375–10 387, 2020.

[116] L. Atzori, A. Iera, and G. Morabito, “The internet of things: A survey,”Computer networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[117] B. McMahan and D. Ramage, “Federated learning: Collaborativemachine learning without centralized training data,” Google ResearchBlog, vol. 3, 2017.

[118] S. R. Pokhrel, “Towards efficient and reliable federated learning usingblockchain for autonomous vehicles,” Computer Networks, p. 107431,2020.

[119] R. James et al., “The internet of things: a study in hype, reality,disruption, and growth,” Raymond James US Research, Technology &Communications, Industry Report, 2014.

[120] S. Zeadally, R. Hunt, Y.-S. Chen, A. Irwin, and A. Hassan, “Vehicularad hoc networks (VANETS): status, results, and challenges,” Telecom-munication Systems, vol. 50, no. 4, pp. 217–241, 2012.

[121] J. T. Isaac, S. Zeadally, and J. S. Camara, “Security attacks andsolutions for vehicular ad hoc networks,” IET communications, vol. 4,no. 7, pp. 894–903, 2010.

[122] J. Guerrero-Ibanez, C. Flores-Cortes, and S. Zeadally, “Vehicular ad-hoc networks (VANETS): architecture, protocols and applications,” inNext-generation wireless technologies. Springer, 2013, pp. 49–70.

[123] Z. Du, C. Wu, T. Yoshinaga, K.-L. A. Yau, Y. Ji, and J. Li, “Federatedlearning for vehicular internet of things: Recent advances and openissues,” IEEE Open Journal of the Computer Society, vol. 1, pp. 45–61, 2020.

[124] M. Zolanvari, M. A. Teixeira, L. Gupta, K. M. Khan, and R. Jain,“Machine learning-based network vulnerability analysis of industrialinternet of things,” IEEE Internet of Things Journal, vol. 6, no. 4, pp.6822–6834, 2019.

[125] Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, “Edgeintelligence: Paving the last mile of artificial intelligence with edgecomputing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762,2019.

Page 24: Fusion of Federated Learning and Industrial Internet of ...

24

[126] G. Xu, H. Li, Y. Dai, K. Yang, and X. Lin, “Enabling efficient andgeometric range query with access control over encrypted spatial data,”IEEE Transactions on Information Forensics and Security, vol. 14,no. 4, pp. 870–885, 2018.

[127] T. Pang, K. Xu, C. Du, N. Chen, and J. Zhu, “Improving adver-sarial robustness via promoting ensemble diversity,” arXiv preprintarXiv:1901.08846, 2019.

[128] U. Hwang, J. Park, H. Jang, S. Yoon, and N. I. Cho, “Puvae: Avariational autoencoder to purify adversarial examples,” IEEE Access,vol. 7, pp. 126 582–126 593, 2019.

[129] J. Konecny, H. B. McMahan, D. Ramage, and P. Richtarik, “Federatedoptimization: Distributed machine learning for on-device intelligence,”arXiv preprint arXiv:1610.02527, 2016.

[130] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan,S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggre-gation for privacy-preserving machine learning,” in Proceedings of the2017 ACM SIGSAC Conference on Computer and CommunicationsSecurity, 2017, pp. 1175–1191.

[131] R. C. Geyer, T. Klein, and M. Nabi, “Differentially private federatedlearning: A client level perspective,” arXiv preprint arXiv:1712.07557,2017.

[132] A. Husakovic, E. Pfann, and M. Huemer, “Robust machine learningbased acoustic classification of a material transport process,” in 201814th Symposium on Neural Networks and Applications (NEUREL).IEEE, 2018, pp. 1–4.

[133] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas,“Communication-efficient learning of deep networks from decentral-ized data,” in Artificial Intelligence and Statistics. PMLR, 2017, pp.1273–1282.

[134] T. T. Anh, N. C. Luong, D. Niyato, D. I. Kim, and L.-C. Wang,“Efficient training management for mobile crowd-machine learning:A deep reinforcement learning approach,” IEEE Wireless Communica-tions Letters, vol. 8, no. 5, pp. 1345–1348, 2019.

[135] X. Yang, W. Zhang, H. Lu, and L. Zhao, “V2V routing in VANETbased on heuristic q-learning.” International Journal of Computers,Communications & Control, vol. 15, no. 5, 2020.

[136] C. Wu, T. Yoshinaga, Y. Ji, and Y. Zhang, “Computational intelligenceinspired data delivery for vehicle-to-roadside communications,” IEEETransactions on Vehicular Technology, vol. 67, no. 12, pp. 12 038–12 048, 2018.

[137] N. R. Jennings, “The ARCHON system and its applications,”in 2nd Int. Conf. on Cooperating Knowledge Based Systems(CKBS-94) (01/01/94), 1994, pp. 13–29. [Online]. Available: https://eprints.soton.ac.uk/252176/

[138] Y. W. Lee, D. M. Strong, B. K. Kahn, and R. Y. Wang, “AIMQ:a methodology for information quality assessment,” Information &management, vol. 40, no. 2, pp. 133–146, 2002.

[139] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman,V. Ivanov, C. Kiddon, J. Konecny, S. Mazzocchi, H. B. McMahan et al.,“Towards federated learning at scale: System design,” arXiv preprintarXiv:1902.01046, 2019.

[140] Y. Jin, X. Wei, Y. Liu, and Q. Yang, “A survey towards federatedsemi-supervised learning,” arXiv preprint arXiv:2002.11545, 2020.

[141] T. Zeng, O. Semiari, W. Saad, and M. Bennis, “Joint communicationand control for wireless autonomous vehicular platoon systems,” IEEETransactions on Communications, vol. 67, no. 11, pp. 7907–7922,2019.

[142] A. Ferdowsi, U. Challita, and W. Saad, “Deep learning for reliable mo-bile edge analytics in intelligent transportation systems: An overview,”ieee vehicular technology magazine, vol. 14, no. 1, pp. 62–70, 2019.

[143] A. Challoner and G. H. Popescu, “Intelligent sensing technology, smarthealthcare services, and internet of medical things-based diagnosis,”American Journal of Medical Research, vol. 6, no. 1, pp. 13–18, 2019.

[144] E. Omanovic-Miklicanin, M. Maksimovic, and V. Vujovic, “The futureof healthcare: nanomedicine and internet of nano things,” Folia MedicaFacultatis Medicinae Universitatis Saraeviensis, vol. 50, no. 1, 2015.

[145] S. Wang, J. Wan, D. Zhang, D. Li, and C. Zhang, “Towards smartfactory for Industry 4.0: A self-organized multi-agent system with bigdata based feedback and coordination,” Computer Networks, vol. 101,pp. 158–168, 2016.

[146] P. Gope and T. Hwang, “BSN-care: A secure IoT-based modernhealthcare system using body sensor network,” IEEE sensors journal,vol. 16, no. 5, pp. 1368–1376, 2015.

[147] L. C. C. Gradim, M. A. Jose, D. M. C. da Cruz, and R. de Deus Lopes,“IoT services and applications in rehabilitation: An interdisciplinaryand meta-analysis review,” IEEE Transactions on Neural Systems andRehabilitation Engineering, vol. 28, no. 9, pp. 2043–2052, 2020.

[148] S. R. Pokhrel and J. Choi, “Federated learning with blockchain forautonomous vehicles: Analysis and design challenges,” IEEE Transac-tions on Communications, vol. 68, no. 8, pp. 4734–4746, 2020.

[149] S. Tuli, S. Tuli, G. Wander, P. Wander, S. S. Gill, S. Dustdar,R. Sakellariou, and O. Rana, “Next generation technologies for smarthealthcare: Challenges, vision, model, trends and future directions,”Internet Technology Letters, vol. 3, no. 2, p. e145, 2020.

[150] B. Diene, J. J. Rodrigues, O. Diallo, E. H. M. Ndoye, and V. V. Koro-taev, “Data management techniques for internet of things,” MechanicalSystems and Signal Processing, vol. 138, p. 106564, 2020.

[151] X. Zheng, J. Lu, S. Sun, and D. Kiritsis, “Decentralized industrial IoTdata management based on blockchain and IPFS,” in IFIP Interna-tional Conference on Advances in Production Management Systems.Springer, 2020, pp. 222–229.

[152] N. E. Petroulakis, E. Lakka, E. Sakic, V. Kulkarni, K. Fysarakis,I. Somarakis, J. Serra, L. Sanabria-Russo, D. Pau, M. Falchetto et al.,“Semiotics architectural framework: End-to-end security, connectivityand interoperability for industrial IoT,” in 2019 Global IoT Summit(GIoTS). IEEE, 2019, pp. 1–6.

[153] P. L. G. Ramırez, M. Taha, J. Lloret, and J. Tomas, “An intelligentalgorithm for resource sharing and self-management of wireless-IoT-gateway,” IEEE Access, vol. 8, pp. 3159–3170, 2019.

[154] M. Platenius-Mohr, S. Malakuti, S. Gruner, J. Schmitt, and T. Gold-schmidt, “File-and API-based interoperability of digital twins by modeltransformation: An IIoT case study using asset administration shell,”Future Generation Computer Systems, vol. 113, pp. 94–105, 2020.

[155] Y. Aono, T. Hayashi, L. Wang, S. Moriai et al., “Privacy-preservingdeep learning via additively homomorphic encryption,” IEEE Transac-tions on Information Forensics and Security, vol. 13, no. 5, pp. 1333–1345, 2017.

[156] L. Melis, C. Song, E. De Cristofaro, and V. Shmatikov, “Inference at-tacks against collaborative learning,” arXiv preprint arXiv:1805.04049,vol. 13, 2018.

[157] S. Rahimian, T. Orekondy, and M. Fritz, “Sampling attacks: Ampli-fication of membership inference attacks by repeated queries,” arXivpreprint arXiv:2009.00395, 2020.

[158] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov, “How tobackdoor federated learning,” in International Conference on ArtificialIntelligence and Statistics. PMLR, 2020, pp. 2938–2948.

[159] R. Tourani, S. Misra, T. Mick, and G. Panwar, “Security, privacy,and access control in information-centric networking: A survey,” IEEEcommunications surveys & tutorials, vol. 20, no. 1, pp. 566–600, 2017.

[160] A. K. Mishra, A. K. Tripathy, D. Puthal, and L. T. Yang, “Analyticalmodel for sybil attack phases in internet of things,” IEEE Internet ofThings Journal, vol. 6, no. 1, pp. 379–387, 2018.

[161] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, andK. Chan, “Adaptive federated learning in resource constrained edgecomputing systems,” IEEE Journal on Selected Areas in Communica-tions, vol. 37, no. 6, pp. 1205–1221, 2019.

[162] T. Nishio and R. Yonetani, “Client selection for federated learningwith heterogeneous resources in mobile edge,” in ICC 2019-2019 IEEEInternational Conference on Communications (ICC). IEEE, 2019, pp.1–7.

[163] Y. Liu, N. Kumar, Z. Xiong, W. Y. B. Lim, J. Kang, and D. Niyato,“Communication-efficient federated learning for anomaly detection inindustrial internet of things,” in GLOBECOM 2020-2020 IEEE GlobalCommunications Conference (GLOBECOM). IEEE, 2020, pp. 1–6.

[164] Y. Song, T. Liu, T. Wei, X. Wang, Z. Tao, and M. Chen, “FDA3:Federated defense against adversarial attacks for cloud-based IIoTapplications,” IEEE Transactions on Industrial Informatics, pp. 1–9,2020.