Natural Language Processing Lab, National Taiwan University
The Splog Detection Task and a Solution Based on Temporal and Link Properties
Yu-Ru Lin et al.
NEC America, TREC 2006 (Blog session)
Presenter: Chun-Yuan Teng
Splog characteristics
• Machine-generated content
• No value addition
  – No unique information for their readers
• Hidden agenda, usually an economic goal
  – Commercial intention
Uniqueness of splogs
• Dynamic content
  – Unlike web spam, a splog generates fresh content to drive traffic
• Non-endorsement links
  – On the web, a hyperlink is an endorsement of the target page
  – Spammers can create hyperlinks in normal blogs, so a link in a blog is not necessarily an endorsement
Features to detect splogs
• Traditional features
  – Tokenized URL, blog and post titles, homepage content, and post content
• Temporal regularity
  – Temporal content regularity / temporal structural regularity
• Link regularity
  – Consistency in target website
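As a rough illustration of the "tokenized URL" feature listed above, the sketch below splits a URL into lowercase alphanumeric tokens that could feed a bag-of-words classifier. The helper name `tokenize_url` and the example URL are assumptions made for illustration; the slides do not specify the tokenization rule.

```python
import re


def tokenize_url(url: str) -> list[str]:
    """Split a URL into lowercase alphanumeric tokens (hypothetical helper)."""
    # Any run of characters other than digits or ASCII letters is a separator.
    return [tok for tok in re.split(r"[^0-9a-z]+", url.lower()) if tok]


# Made-up URL, used only to show the shape of the output.
tokens = tokenize_url("http://cheap-loans-4u.example.com/2006/best_rates.html")
print(tokens)
```

Tokens such as "cheap" or "loans" in a hostname are the kind of signal a text classifier can pick up from this feature.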
Temporal Content Regularity
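One way to make this notion concrete, as a minimal sketch and not necessarily the estimator used in the paper: measure how similar a blog's posts are to the posts that follow them at a fixed lag. A blog that keeps republishing near-identical text scores close to 1. The function names and the choice of Jaccard similarity over word sets are assumptions.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two word sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_similarity_at_lag(posts: list[str], lag: int) -> float:
    """Mean Jaccard similarity between posts that are `lag` positions apart."""
    bags = [set(p.lower().split()) for p in posts]
    pairs = list(zip(bags, bags[lag:]))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Made-up posts in chronological order.
posts = ["buy cheap pills now", "buy cheap pills today", "buy cheap pills now"]
lag1 = content_similarity_at_lag(posts, 1)
lag2 = content_similarity_at_lag(posts, 2)
print(lag1, lag2)
```

Scanning several lags and looking for peaks would reveal periodic repetition, which is the intuition behind a content-regularity feature.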
Temporal Structural Regularity
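A hedged sketch of how posting-time regularity might be quantified: bin the gaps between consecutive posts and take the Shannon entropy of the resulting histogram. Low entropy means the blog posts on a near-fixed schedule, as an automated generator might. The function name, the one-hour bin width, and the timestamps are assumptions; the paper's exact estimator may differ.

```python
import math
from collections import Counter


def interval_entropy(post_times: list[float], bin_seconds: float = 3600.0) -> float:
    """Shannon entropy (in bits) of the binned gaps between consecutive posts."""
    times = sorted(post_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 0.0
    counts = Counter(int(gap // bin_seconds) for gap in gaps)
    total = len(gaps)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up timestamps in seconds.
regular = interval_entropy([0, 3600, 7200, 10800])          # a post every hour
irregular = interval_entropy([0, 3600, 4 * 3600, 9 * 3600])  # gaps of 1, 3, 5 hours
print(regular, irregular)
```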
Link Regularity estimation
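"Consistency in target website" can be illustrated with a simple proxy: the share of a blog's outgoing links that point at its single most common host. This is only an assumed stand-in for the paper's link-regularity estimate; the function name `target_concentration` and the example links are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def target_concentration(links: list[str]) -> float:
    """Share of outgoing links that point at the single most common host."""
    hosts = [urlparse(link).netloc.lower() for link in links if link]
    if not hosts:
        return 0.0
    return Counter(hosts).most_common(1)[0][1] / len(hosts)


# Made-up outgoing links collected from one blog's posts.
links = [
    "http://shop.example.com/a",
    "http://shop.example.com/b",
    "http://shop.example.com/c",
    "http://other.example.org/",
]
score = target_concentration(links)
print(score)
```

A value near 1 means nearly every link drives traffic to the same site, which fits the commercial-intent pattern described earlier in the slides.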
Two kinds of spam detection
• Offline detection
  – Traditional measurement
• Online detection
  – Detect spam online
Experimental results (Offline)
Experimental results (Offline)
Online indexing in a blog search engine
Online test
Online test in this paper
Experimental results
Conclusion and contributions
• Modeling the splog problem
  – The uniqueness of splogs
• Regularity-based detection
  – Content and post time
• Evaluation
  – Online evaluation