PREDICTION AND PREVENTION OF DOMESTIC VIOLENCE FROM SOCIAL BIG DATA USING MACHINE LEARNING APPROACH

Chandramohan B1, Arunkumar T2

1 Associate Professor, School of CSE, VIT Vellore, India, [email protected], +91-9976611188

2 Professor and Dean, School of CSE, VIT Vellore, India

June 13, 2018

Abstract

Nowadays, much crucial and commercially valuable information is available on the World Wide Web. In particular, social communication tools are attracting the interest of both the scientific and business worlds. Hence, social networks will lead to developments in the science of opinion, sentiment analysis and mining. This proposal monitors social media big data and analyses offences using machine learning. The proposed prediction model identifies domestic criminal behaviours. It uses a machine learning approach to predict domestic violence such as suicide, crimes in the student community, and crimes against elderly people and people living in orphanages. The proposed system implements prediction of domestic crimes such as crime against elderly people, suicide and crime against girls. It also introduces the possibility of analyzing the temporal evolution of the connections among individuals in the network.


International Journal of Pure and Applied Mathematics
Volume 120 No. 6 2018, 3549-3561
ISSN: 1314-3395 (on-line version)
url: http://www.acadpubl.eu/hub/
Special Issue


Keywords: Criminal Analysis, Machine Learning, Social Network, Prediction Model, Big Data.

1 INTRODUCTION

Coping with the huge amount of information present on social networks is a crucial task, and it motivates the study and creation of efficient models. Current research on social networks focuses mainly on efficient approaches to support emotion recognition, sentiment analysis, and polarity detection in natural language text. Sentiment analysis has applications in diverse contexts, such as analysing individuals' opinions about products, issues, and social and political events. Understanding public opinion can help to improve decision making. Opinion mining is a way of retrieving information via search engines, blogs, micro-blogs and social networks. Individual opinions are unique to each person, and Twitter tweets are an invaluable source of this type of data.

However, the huge volume and unstructured nature of text/opinion data make it challenging to analyse the data efficiently. Accordingly, proficient algorithms and computational strategies are required for mining and condensing tweets as well as finding sentiment-bearing words.

Most existing computational methods, models and algorithms in the literature for identifying sentiments from such unstructured data rely on machine learning techniques as their basis.

2 CRIME ANALYSIS FROM SOCIAL NETWORK

Data mining can be used to make accurate predictions and is applied to real-world situations such as stock market forecasting. Crime analysis is an application of data mining that identifies patterns of criminal behaviour. It is used to predict crime, anticipate criminal activity and prevent it.

Machine learning is a technique that gives computers the ability to learn without being explicitly programmed. Machine learning provides better results for analysis and prediction, especially for large data sets and big data. A variety of machine learning algorithms are available that can be applied to datasets. The performance of machine learning depends on the underlying learning algorithm. There are two types of learning algorithms: supervised and unsupervised.

Supervised learning algorithms work by inferring information, or "the right answer", from labelled training data; this inference step is called training. Unsupervised learning algorithms, however, aim to find hidden structures in unlabelled data. Machine learning algorithms are used to conduct the following four major kinds of analysis: (i) classification, (ii) regression, (iii) clustering and (iv) association.
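
The distinction between the two types of learning can be illustrated with a minimal sketch. The code below is not the method of this paper: the nearest-centroid classifier, the two-cluster k-means routine, the toy points and the labels "neutral" and "flagged" are all assumptions chosen only for illustration.

```python
from collections import defaultdict


def _sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def _mean(points):
    """Coordinate-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def train_nearest_centroid(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    by_label = defaultdict(list)
    for point, label in zip(points, labels):
        by_label[label].append(point)
    return {label: _mean(members) for label, members in by_label.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids, key=lambda label: _sq_dist(centroids[label], point))


def two_means(points, iterations=10):
    """Unsupervised: split unlabelled points into two clusters (k-means, k=2)."""
    centers = [points[0], points[-1]]  # simple deterministic initialisation
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if _sq_dist(centers[0], p) <= _sq_dist(centers[1], p) else 1
            groups[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [_mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (5, 5), (6, 5)]
    model = train_nearest_centroid(pts, ["neutral", "neutral", "flagged", "flagged"])
    print(predict(model, (1, 1)))
    print(two_means(pts))
```

The supervised routine needs the label list to learn anything, while the clustering routine recovers the same two groups from the points alone.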

This proposal uses a machine learning algorithm to identify patterns in social media big data that involve crime, especially domestic crime such as acts by miscreants and suicide. The proposed machine learning model is designed using a self-motivated learning algorithm (Chebiyyam et al, 2014). Self-motivated learning is introduced in the training phase of an improved functional link feedforward neural network to improve the results of the prediction model.

The proposed system implements this computational framework, which encodes the entire workflow of the proposed system. The data model is designed to improve the quality of the analysis of social relationships observed in social big data through the integration of visualization and social network analysis-based statistical metrics. Our previous work demonstrates a swarm-based implementation of prediction and analysis (Chandramohan, 2015).

3 RELATED WORK

Crime analysis from big data to discover the incidence of crime was initiated by the Los Angeles Police Department in 2010. The best-known predictive policing software, PredPol, was developed and first used by the Los Angeles and Santa Cruz police departments. The software can predict where crimes are likely to occur with a precision of 500 square feet. Karl et al (2010) found that those who are more compulsive in their use of the internet are more likely to post problematic material, such as substance use. Meanwhile, those who are more conscientious, agreeable, and emotionally stable are less likely to post this type of content.

In LA, use of the data-powered system was followed by a 33% reduction in burglaries and a 21% reduction in violent crimes in areas where the software was implemented. After PredPol [PREDPOL] was adopted by the Atlanta Police Department, the city saw an overall crime drop of 19%, and examples of other effective deployments proliferate. PredPol software is currently being trialled in over 150 cities across the US. Besides PredPol, other tools related to our proposal include commercial tools such as COPLINK (Chen et al, 2003), Analyst's Notebook, Xanalysis Link Explorer (Xu & Chen, 2005a,b), Palantir Government, Sandbox (Wright et al, 2006) and POLESTAR (Pioch & Everett, 2006).

Das and Jyothi (2011) and Mittal and Singh (2014) studied social media crimes and their impact in India, and reported that they are more prevalent than real-world crimes. Agarwal et al (2014) analysed historical information and its relationship with the structural characteristics of communities that are exposed to cybercrime. Domestic crimes such as suicide by girls and elderly people were studied in the real world by Venkoba Rao (1991), and their social impact in social media was reported by Mishra and Patel (2013). Similarly, crime in urban mobile sharing was studied by Smyth et al (2010).

India accounts for close to $8bn of the total $110bn cost of global cybercrime (Mishra et al, 2016). Nowadays, hackers create short URLs and post them on friends' boards. Lured users who unknowingly click on these links are redirected to malicious websites. To control this type of activity, Venkatesh et al (2017) suggested a trust score for each user. The trust score is used to decide whether a user is trustworthy.

Singh et al (2017) proposed sentiment analysis using machine learning and natural language processing to improve business intelligence in social networks. The authors took reviews posted at large scale on social networks and performed aggregation operations on millions of reviews, presenting the results in a user-friendly pie-chart format.

Gupta and Sardana (2017) proposed a naive Bayesian algorithm for predicting missing links between individuals in social networks. Because social networks are huge and have millions of nodes, they are complex to analyse. Hence, the authors narrowed the social network down to an ego network. An ego network is the part of a social network that contains an individual (the ego) together with their direct connections, known as alters. The authors applied naive Bayes to predict new links and analysed the similarity score of existing links between two nodes in the ego network. A higher similarity score indicates a higher likelihood of a new link between nodes (Marietta et al, 2016).
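
The idea of ranking candidate links inside an ego network by a similarity score can be sketched as follows. This is not the naive Bayes model of Gupta and Sardana: a Jaccard similarity over shared neighbours is swapped in as a simpler stand-in, and the function names and the toy graph are hypothetical.

```python
def ego_network(graph, ego):
    """Restrict an undirected graph (dict of neighbour sets) to the ego and its alters."""
    members = {ego} | graph[ego]
    return {node: graph[node] & members for node in members}


def jaccard(graph, a, b):
    """Shared neighbours divided by all neighbours of a and b (excluding each other)."""
    na, nb = graph[a] - {b}, graph[b] - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def rank_missing_links(graph):
    """Score every non-adjacent pair; a higher score suggests a more likely new link."""
    nodes = sorted(graph)
    candidates = [
        (jaccard(graph, a, b), a, b)
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
        if b not in graph[a]
    ]
    return sorted(candidates, key=lambda item: (-item[0], item[1], item[2]))


if __name__ == "__main__":
    # Toy friendship graph; node "d" is outside the ego network of "ego".
    friends = {
        "ego": {"a", "b", "c"},
        "a": {"ego", "b"},
        "b": {"ego", "a"},
        "c": {"ego", "d"},
        "d": {"c"},
    }
    print(rank_missing_links(ego_network(friends, "ego")))
```

Within an ego network every pair of alters shares at least the ego as a common neighbour, so the score mainly separates pairs by how many other alters they have in common.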

Identifying the mood of a person from social network texts such as Twitter messages is a recent research topic that involves finding the underlying semantics of the text. Gaikwad and Joshi (2017) proposed a machine learning algorithm for finding sentiments using a lexicon database. The authors identify the single word that affects the whole sentence through its 'Impact Factor', a measure of the influence of a word on the overall semantics of the sentence. The word with the highest Impact Factor in the sentence is chosen as the most influential word (Arulmozhi et al, 2011).
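
A lexicon-based selection of the most influential word might look like the sketch below. The exact definition of the Impact Factor is not given here, so the code assumes it is the absolute polarity weight of a word in a lexicon; the lexicon entries and weights are invented for illustration.

```python
# Hypothetical polarity lexicon: positive weights for positive mood, negative for negative mood.
LEXICON = {"happy": 0.6, "tired": -0.4, "hopeless": -0.9, "fine": 0.2}


def impact_factors(sentence, lexicon=LEXICON):
    """Map each lexicon word in the sentence to its assumed impact (absolute weight)."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return {w: abs(lexicon[w]) for w in words if w in lexicon}


def most_influential_word(sentence, lexicon=LEXICON):
    """Return (word, polarity) for the word with the highest impact, or None."""
    scores = impact_factors(sentence, lexicon)
    if not scores:
        return None
    word = max(scores, key=scores.get)
    return word, lexicon[word]


if __name__ == "__main__":
    print(most_influential_word("I feel tired and hopeless today."))
```

Under this assumption, the sign of the selected word's weight gives the polarity assigned to the whole sentence.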

Sarda and Chouhan (2017) proposed characterizing non-situational tweets into a set of fine-grained classes, drawn from the millions of tweets posted during national violence and disasters, using state-of-the-art machine learning techniques. The authors concluded that this characterization helps in filtering out communal tweets, which can worsen the situation by disrupting communal harmony during certain disaster events.

Padmaja et al (2016) measured the hierarchy, density, farness, reachability and eigenvector of an individual in social networks, which influence the individual's stress level, using the TreeNet machine learning algorithm. The authors tested their implementation using a Random Forest classifier (Chandra Mohan et al, 2012). They concluded that if an ego lies on the shortest paths between all other alters, then the ego receives more information and is therefore more stressed; this conclusion is useful to economists, professionals, analysts, and policy makers.
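
Two of the network measures named above can be computed with a breadth-first search, as in the sketch below. It assumes the common graph-theoretic definitions (farness as the sum of shortest-path distances from a node, reachability as the number of nodes reachable from it); the definitions used by Padmaja et al may differ, and the star-shaped toy graph is hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path hop counts from source in an unweighted graph (dict of neighbour sets)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def farness(graph, node):
    """Sum of shortest-path distances from node to every reachable node."""
    return sum(bfs_distances(graph, node).values())


def reachability(graph, node):
    """Number of other nodes that can be reached from node."""
    return len(bfs_distances(graph, node)) - 1


if __name__ == "__main__":
    # Star network: the ego sits on every shortest path between alters.
    star = {"ego": {"a", "b", "c"}, "a": {"ego"}, "b": {"ego"}, "c": {"ego"}}
    print(farness(star, "ego"), farness(star, "a"), reachability(star, "a"))
```

In the star network the ego has the lowest farness, which matches the intuition that a node on the shortest paths between all alters receives the most information.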


4 PROPOSED METHOD

The social big data are first preprocessed and separated into training and testing data. The training data is given as input to the proposed Artificial Neural Network (ANN) model. The result of the ANN is processed by sentiment analysis and then stored as feedback in the data sets to moderate them. Self-motivated learning is used to improve the performance of the ANN model. The block diagram of the proposed system is shown in Figure 1.
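
The first step of this pipeline, preprocessing followed by a train/test split, can be sketched as below. The paper does not specify its preprocessing rules or split ratio, so the lowercasing, URL removal, tokenisation pattern, 80/20 ratio and fixed seed are all assumptions.

```python
import random
import re


def preprocess(text):
    """Lowercase a post, drop URLs, and keep alphabetic tokens only."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records reproducibly and split off a test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    posts = [f"Post number {i} http://example.com/{i}" for i in range(10)]
    tokens = [preprocess(p) for p in posts]
    train, test = train_test_split(tokens)
    print(len(train), len(test))
```

Only the training portion would be passed to the ANN; the held-out portion is kept for evaluating the predictions.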

Figure 1: Flow Diagram of proposed prediction system

Crimes committed by groups are identified using the above two steps, whereas individual-centric domestic crimes such as suicide and sexual abuse may be identified using a machine learning approach. The knowledge learned by the machine learning approach may change according to age group, geographical location and the crime involved.

In the proposed system, additional inputs are used to train the neural network. The architecture of the proposed self-motivated ANN (Analyzable Structured Neural Network, ASNN) with improved functional link architecture is shown in Figure 2. Here X = [X1, ..., Xn] is the input (training tuples), and Xi and Xj are additional inputs: Xi is generated based on the functional link and Xj is generated based on the output.
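
One way to read this description is as a single-layer functional link network whose input vector is extended with expanded features and with the previous output. The sketch below follows that reading only as an assumption: the trigonometric expansion, the sigmoid output unit and all names are illustrative, since the paper does not state how Xi and Xj are computed.

```python
import math


def functional_link_expand(x):
    """Append sin(pi*v) and cos(pi*v) for each input v (an assumed expansion for Xi)."""
    expanded = list(x)
    for v in x:
        expanded.append(math.sin(math.pi * v))
        expanded.append(math.cos(math.pi * v))
    return expanded


def forward(x, weights, bias, previous_output=0.0):
    """One forward pass; previous_output plays the role of the feedback input Xj."""
    features = functional_link_expand(x) + [previous_output]
    if len(weights) != len(features):
        raise ValueError("expected %d weights, got %d" % (len(features), len(weights)))
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation


if __name__ == "__main__":
    x = [0.5, 1.0]
    w = [0.0] * 6 + [1.0]  # only the feedback input is weighted in this toy setup
    first = forward(x, w, bias=0.0)
    second = forward(x, w, bias=0.0, previous_output=first)
    print(first, second)
```

With n original inputs this expansion yields 3n features plus one feedback value, so the weight vector has 3n + 1 entries.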

Finally, it provides an unprecedented set of supervised community detection techniques that allows detectives to interact with the community detection process, incorporating expert knowledge to supervise the results and refine the unveiled community structure at different levels of granularity and resolution.


Figure 2: Proposed self-motivated ANN with improved functional link (FL) architecture

The relevant work is aimed at detecting network communities, discovering their patterns of interaction, identifying central individuals, and uncovering network organization and structure.

The proposed work represents a next-generation criminal investigation expert system. It introduces significant improvements over existing tools, and it provides specific support for detecting criminal activities in social data.

The problem of finding communities in a network is often formalized as a clustering problem. One widely adopted approach to solving this problem is based on the concept of network modularity, which can be explained as follows: consider a network, represented by a graph G = (V, E), which has been partitioned into m communities.
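
The text stops before stating the modularity itself. As a hedged illustration, the sketch below uses the standard Newman-Girvan definition, which is assumed rather than taken from this paper: for each community c, with e_c edges inside it and total degree d_c, the modularity is Q = sum over c of [ e_c / |E| - (d_c / (2|E|))^2 ]. The function name and the toy graph are hypothetical.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    total = len(edges)
    if total == 0:
        return 0.0
    intra = {}   # number of edges with both ends inside each community
    degree = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / total - (d / (2 * total)) ** 2
        for c, d in degree.items()
    )


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (3, 4).
    edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
    split = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
    print(modularity(edges, split))
```

A partition that groups densely connected nodes scores higher than one that ignores the structure; placing every node in a single community gives Q = 0.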

5 RESULTS

An initial level of implementation has been executed. The architecture of the social data, in the form of a social network, is shown in Figure 3. The result of clustering the data is shown in Figure 4. The intra-community relationships in the social network are shown in Figure 5.


Figure 3: Social network architecture

Figure 4: Clusters detected in the social network


Figure 5: Intra-community relationships

6 CONCLUSION

Crime analysis has been an important and critical task in recent decades. Social networks give miscreants the opportunity to upload and spread anonymous information, which leads to crime in many ways. Many forms of domestic violence, such as crimes against elders and girls, and suicides, can be predicted from social big data. This paper applies a machine learning approach to design a prediction model for predicting domestic violence from social big data.

References

[1] Agarwal, V.K., Garg, S.K., Kapil, M., Sinha, D., Cyber Crime Investigations in India: Rendering Knowledge from the Past to Address the Future, Advances in Intelligent Systems and Computing, Vol. 249, pp. 593-600, 2014.

[2] Chen, H., COPLINK Connect: information and knowledge management for law enforcement, Decision Support Systems, Vol. 34, No. 3, pp. 271-285, 2003.

[3] Das, B., and Jyoti S. S., "Social networking sites - A critical analysis of its impact on personal and social life", International Journal of Business and Social Science, Vol. 2, p. 14, 2011.


[4] Karl, Katherine, Joy, and Schlaegel, "Who's posting Facebook faux pas? A cross-cultural examination of personality differences", International Journal of Selection and Assessment, Vol. 18, No. 2, pp. 174-186, 2010.

[5] Mishra, A.J. and Patel, A.B., Crimes against the elderly in India: a content analysis on factors causing fear of crime, International Journal of Criminal Justice Sciences, Vol. 8, No. 1, p. 13, 2013.

[6] Mishra, S., Dhir, S., Hooda, M., A study on cybersecurity, its issues and cybercrime rates in India, 3rd International Conference on Innovations in Computer Science and Engineering, ICICSE 2015; Advances in Intelligent Systems and Computing, Vol. 413, pp. 249-253, 2016.

[7] Mittal, S., Singh, A., A study of cybercrime and perpetration of cybercrime in India, Evolving Issues Surrounding Technoethics and Society in the Digital Age, pp. 171-186, 2014.

[8] Morselli, C., Inside criminal networks, Studies of Organized Crime, Springer, 8th Ed, 2009.

[9] Pioch, N.J., and Everett J. O., "POLESTAR: collaborative knowledge management and sense making tools for intelligence analysts", Proceedings of the 15th ACM international conference on Information and knowledge management, ACM, pp. 513-521, 2006.

[10] PREDPOL, available online at <http://www.predpol.com/>

[11] Smyth, T. N., Satish K., Indrani M., and Kentaro T., "Where there's a will there's a way: mobile media sharing in urban India", Proceedings of the SIGCHI conference on Human Factors in computing systems, ACM, pp. 753-762, 2010.

[12] Venkatesh, R., Rout, J.K., Jena, S.K., Malicious account detection based on short URLs in twitter, Proceedings of International Conference on Signal, Networks, Computing, and Systems; Lecture Notes in Electrical Engineering, Vol. 395, pp. 243-251, 2017.


[13] Venkoba Rao, A., "Suicide in the elderly: A report from India", Crisis: The Journal of Crisis Intervention and Suicide Prevention, 1991.

[14] Wright, Schroh, Proulx, Skaburskis, & Cort, "The Sandbox for analysis: concepts and methods", Proceedings of the SIGCHI conference on Human Factors in computing systems, ACM, 2006.

[15] Xu, J. J., & Chen, H., Crimenet explorer: A framework for criminal network knowledge discovery, ACM Transactions on Information Systems, Vol. 23, No. 2, pp. 201-226, 2005b.

[16] Xu, J., & Chen, H., Criminal network analysis and visualization, Communications of the ACM, Vol. 48, No. 6, pp. 100-107, 2005a.

[17] Singh, B., Kushwaha, N., Vyas, O.P., An interpretation of sentiment analysis for enrichment of Business Intelligence, 2017, Proceedings/TENCON of IEEE Region 10 Annual International Conference, pp. 18-23.

[18] Gupta, A.K., Sardana, N., Naive Bayes approach for predicting missing links in ego networks, 2017, Proceedings of IEEE International Symposium on Nanoelectronic and Information Systems (iNIS 2016), pp. 161-165.

[19] Gaikwad, G., Joshi, D.J., Multiclass Mood classification on twitter using lexicon dictionary and machine learning algorithms, 2017, Proceedings of the International Conference on Inventive Computation Technologies (ICICT 2016), pp. 1-8.

[20] Sarda, P., Chouhan, R.L., Extracting non-situational information from twitter during disaster events, 2017, Journal of Cases on Information Technology, Vol. 19, No. 1, pp. 15-23.

[21] Padmaja, B., Prasad, V.V.R., Sunitha, K.V.N., TreeNet analysis of human stress behavior using socio-mobile data, 2016, Journal of Big Data, Vol. 3, No. 1, pp. 24-32.

[22] Chandramohan, B., Restructured ant colony optimization routing protocol for next generation network, 2015, International Journal of Computers, Communications and Control, 10(4), pp. 492-499.

[23] Chebiyyam, S.K., Ab, Chandramoham, Review on secured cloud environment using cryptographic schemes, 2014, International Journal of Applied Engineering Research, 9(24), pp. 30091-30097.

[24] Marietta, J., Chandra Mohan, B., Review on recent research in iot and its various applications, 2016, International Journal of Pharmacy and Technology, 8(4), pp. 22279-22295.

[25] Arulmozhi, K.S., Karthikeyan, R., Mohan, B.C., Optimizing resource sharing in cloud computing, 2011, Communications in Computer and Information Science, 148 CCIS, pp. 50-55.

[26] Chandra Mohan, B. and Baskaran, R., A Survey: Ant Colony Optimization based recent research in various engineering domains, Expert Systems with Applications, Elsevier, Vol. 39, No. 4, pp. 4618-4627, 2012.
