Filtering of Spam E-Mails Using Back-Propagation
Neural Networks
Class: 資四A
Professor: 楊維忠
Reporter: 林文仁
Team Members: 江念庭, 林俊宇, 黃國峰
Outline
• Neural Network
• Back-propagation algorithm
• Flow chart of research
• Input & output
• System environment
• Flow chart of filtering e-mail
• Example
• Conclusion
Neural Network
• Input → Neural Network → Output
• The Neural Network includes connections (called weights) between neurons
• Compare the Output with the Target
• Adjust weights
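The compare-and-adjust cycle above can be sketched for a single neuron. This is only an illustrative sketch, not the setup used in the slides: the sigmoid transfer function, the learning rate `lr`, and the function name `train_step` are assumptions.

```python
import math

def sigmoid(x):
    """Logistic transfer function (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))

def train_step(weights, bias, inputs, target, lr=0.5):
    """One compare-and-adjust cycle for a single sigmoid neuron."""
    # Input -> Neural Network -> Output
    output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    # Compare the output with the target
    error = target - output
    # Gradient of the squared error through the sigmoid
    delta = error * output * (1.0 - output)
    # Adjust weights (and bias) in the direction that reduces the error
    new_weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + lr * delta
    return new_weights, new_bias, error
```

Calling `train_step` repeatedly on the same sample should shrink the error, which is the behaviour the diagram describes.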
Back-propagation algorithm—the multilayer feedforward network
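[Figure: multilayer feedforward network with an input layer, a hidden layer, and an output layer (neuron1, neuron2, …, neuronj). Forward pass: each neuron sums (Σ) its inputs weighted by w1 … wi, adds a bias b, and applies a transfer function to produce its result.]
• wi: weight of input i
• b: bias
• transfer function: applied to the weighted sum plus bias to give the result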
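The forward pass in the figure can be sketched in a few lines. This is a minimal sketch under stated assumptions: the sigmoid transfer function and the names `layer_forward` and `forward_pass` are illustrative, and the slides do not state the layer sizes.

```python
import math

def sigmoid(x):
    """Logistic transfer function (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases, transfer=sigmoid):
    """Compute one layer: weights[j][i] is the weight from input i to neuron j."""
    return [
        transfer(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward_pass(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations
```

With one hidden layer and one output neuron, `layers` holds two `(weights, biases)` pairs and the result is a one-element list.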
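Flow chart of research
• Review the literature
• Analyze mail & maillog, and define spam behavior
• Train the neural network on samples
• If testing shows the network is unsuitable, retrain it
• If testing shows the network is suitable, end training
• Integrate the network with the mail server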
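The train, test, and retrain loop in this flow chart can be sketched as follows. The callables `train` and `evaluate`, the `required_rate` threshold, and the `max_rounds` limit are all hypothetical placeholders; the slides do not say how suitability was judged.

```python
def train_until_suitable(train, evaluate, required_rate, max_rounds=10):
    """Retrain until the tested network is suitable or the rounds run out.

    train()           -> returns a freshly trained network (hypothetical)
    evaluate(network) -> returns its identification rate on test mail
    """
    rate = 0.0
    for round_no in range(1, max_rounds + 1):
        network = train()
        rate = evaluate(network)
        if rate >= required_rate:
            # Suitable: end training; the network can be integrated with the mail server
            return network, rate, round_no
        # Unsuitable: loop again and retrain
    return None, rate, max_rounds
```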
Table of rules
• Columns: Header (Reply-To, Date), maillog (from, to)
• Header rows: To (rules 6, 1), From (rule 17), subject (rule 16)
• maillog rows: from (rule 22), Date (rule 25), nrcpts (rule 28)
Input & output
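• Input– 28 rules in total; the most commonly triggered ones are listed below.
• Rule 6: if header-To (the recipient) == header-Reply-To (the address that receives replies), input item 6 is set to 1
• Rule 17: if header-From (the sender) != maillog-from (the sender recorded in the log file), input item 17 is set to 1
• Rule 25: if header-Date (the sending time) differs too much from the system time, input item 25 is set to 1
• Output– Output value between 0.0 and 1.0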
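The three rules above can be sketched as code that fills a 28-element input vector. This is an illustrative sketch only: the dictionary layout of `header`, the one-day `MAX_DATE_SKEW` threshold, the choice to treat an unparseable Date as suspicious, and the function names are all assumptions. The other 25 rules are not described in the slides and stay at 0 here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parseaddr, parsedate_to_datetime

NUM_RULES = 28
MAX_DATE_SKEW = timedelta(days=1)  # assumption: what counts as "differs too much"

def _address(value):
    """Extract the bare, lower-cased address from a header value."""
    return parseaddr(value or "")[1].lower()

def build_input(header, maillog_from, now=None):
    """Return the 0/1 input vector; only rules 6, 17 and 25 are implemented."""
    now = now or datetime.now(timezone.utc)  # must be timezone-aware
    x = [0] * NUM_RULES
    # Rule 6: header-To equals header-Reply-To
    to_addr = _address(header.get("To"))
    if to_addr and to_addr == _address(header.get("Reply-To")):
        x[6 - 1] = 1
    # Rule 17: header-From differs from the sender recorded in the maillog
    if _address(header.get("From")) != _address(maillog_from):
        x[17 - 1] = 1
    # Rule 25: header-Date is far from the system time
    try:
        sent = parsedate_to_datetime(header.get("Date", ""))
    except (TypeError, ValueError):
        sent = None  # assumption: an unparseable Date counts as suspicious
    if sent is not None and sent.tzinfo is None:
        sent = sent.replace(tzinfo=timezone.utc)
    if sent is None or abs(now - sent) > MAX_DATE_SKEW:
        x[25 - 1] = 1
    return x
```

The resulting list is what would be fed to the network, whose single output between 0.0 and 1.0 is then interpreted as a spam score.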
System environment
• OS– Red Hat Enterprise Linux AS 4
• Mail server– Sendmail 8.13.1
• Browser-based client– OpenWebMail 2.52
• Provides a web GUI for checking mail
• Software tools– Matlab 7
Flow chart of filtering e-mail
• Sendmail server
• Milter (Mail Filter): get_value from header and maillog
• Matlab BPN (Neural Network)
• Milter: Add, Change headers
• User’s mailbox
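The "Add, Change headers" step can be illustrated with Python's standard `email` package. This is not the Milter API and not the presenters' implementation: the `X-Spam-Flag` header name, the `[SPAM]` subject prefix, the 0.5 threshold, and the function name `tag_message` are assumptions chosen for the sketch.

```python
from email import message_from_string

def tag_message(raw_message, score, threshold=0.5):
    """Mark a message as spam when the network output reaches the threshold."""
    msg = message_from_string(raw_message)
    if score >= threshold:
        # Add a header (deleting first avoids duplicates; absent headers are ignored)
        del msg["X-Spam-Flag"]
        msg["X-Spam-Flag"] = "YES"
        # Change a header: prefix the subject so the user sees the verdict
        subject = msg["Subject"] or ""
        del msg["Subject"]
        msg["Subject"] = "[SPAM] " + subject
    return msg.as_string()
```

A message scoring below the threshold passes through with its headers unchanged and is delivered to the user's mailbox as usual.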
Example-1: sending a spam e-mail via telnet
ehlo localhost
Mail from: [email protected] TO: [email protected]: “s” [email protected]: [email protected]: [email protected]: 中文信
Date: +0800….
Quit
Example
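The message is received and has been detected as SPAM
Content of headers
The recipient address and the reply-to address are identical; normally they should differ.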
Example-2
Contents of the maillog on the server
Conclusion
• Identification rate ≒ 80%.
• The defined rules are subjective.
• Better to combine with content filtering– e.g. SpamAssassin
Please give us your comments.
Thank you.