Branney-Gant Research Paper

Digital Forensics and Data Mining: The Future of Fighting Government Benefit Fraud

Elizabeth Branney-Gant


Table of Contents

Abstract

Digital Forensics and Data Mining: The Future of Fighting Government Benefit Fraud

1. Current Challenges in Government Benefit Fraud

2. Digital Forensics

3. How Digital Forensics and Data Mining Can Be Conjoined to Fight Benefit Fraud: Build a Layered Defense

4. Predictive Fraud Modeling and Analytics

5. Conclusion

References


Abstract

Government agencies are increasingly a target for government benefit fraud. As

government agencies move from in-person services to Internet or multi-channel services, the

need to "know your customer" to prevent and identify fraud increases. Private sector companies

have developed many strategies to know their customers in a multi-channel environment. Public

sector agencies have unique challenges not faced by private sector companies: legacy programs,

budgets publicly decided years in advance, public procurement regulations and the scrutiny of

public opinion to name a few. This paper explores: current challenges in government benefit

fraud, how digital forensics and data mining can be conjoined to fight benefit fraud and possible

future solutions.

Keywords: digital forensics, data mining, benefit fraud, government, account change

alerts, two factor authentication, machine identification, MAC IDs, multi-page login sequence,

security images, security questions, IP geocoding, IP mapping, predictive fraud modeling, fraud

analytics, crime analytics, idiographic digital profiling, big data, personally identifiable

information, PII, identity theft.


Digital Forensics and Data Mining: The Future of Fighting Government Benefit Fraud

1. Current Challenges in Government Benefit Fraud

Government benefit fraud can take a variety of forms: tax returns, insurance programs

(unemployment, disaster, crop) and income maintenance programs (WIC, SNAP, Medicaid, W-

2). Government programs are unique targets for fraud because government programs are

designed to be accessible in order to facilitate services to individuals who are in need of services.

Furthermore, government programs are increasingly doing business in a multi-channel service

environment.

Gone are the days of face-to-face interactions in most benefit programs. As government

agencies continue to move toward providing services over the phone and Internet, the risk of

fraud increases. Proper precautions must be taken to authenticate users and to differentiate the

customers the programs are designed to serve from the fraudsters looking to exploit the system.

Government benefit fraud is increasingly perpetrated through organized multi-victim identity

theft scams (U.S. Department of Justice, 2015). Consequently, government benefit programs

need to design a layered defense that keeps the system available¹ to real beneficiaries and,

equally importantly, denies and/or detects fraudsters’ intrusions². This paper explores how the

fields of digital forensics, data mining and data analytics combined can be utilized to prevent and

investigate government benefit fraud.

1 The information assurance principle of “availability” means providing “the right information to the right people at the right time” (Fick, 2009).

2 The information assurance principle of “authenticity” means establishing that the user logged into the system is really the person whose credentials are presented; credentials come in many forms, such as a username, a password or personally identifiable information (PII) (Fick, 2009).


2. Digital Forensics

Digital forensics is “the application of computer science and investigative procedures for

a legal purpose involving analysis of digital evidence after proper search authority, chain of

custody, validation with mathematics, use of validated tools, repeatability, reporting and possible

expert presentation” (Sammons, 2012).

Digital forensics is a relatively new forensic science (Kessler, 2010). Yet digital forensics

has many sub-fields that, if utilized properly, can position government agencies to strike the right

balance between the information assurance principles of availability and authenticity. For

example, network forensics analysis focuses on properly capturing, recording and analyzing

network events to identify cyber threats and crimes.

Network forensics analysis generally takes one of two forms: analyzing network traffic or

analyzing log files (Sindhu & Meshram, 2012). Harnessing

the power of network forensics is a mission-critical component of preventing and detecting

government benefit fraud. Network forensics is the ‘what needs to be done’; Section 3 explores

the ‘how it can be done.’ Section 3 proposes the critical infrastructure needed to build a layered

defense to keep fraudsters out. Likewise, the suggested infrastructure increases the rate of

detection in the event that a fraudster gets into the system.

3. How Digital Forensics and Data Mining Can Be Conjoined to Fight Benefit Fraud: Build

a Layered Defense

Building a layered defense is a long-standing military security principle. As the

information age has evolved and cybercrime increased, the principle started being applied to

network and information systems security. In the context of information security, ‘building a

layered defense’ is often referred to as defense in depth (Fick, 2009). Defense in depth is an

approach to managing risk that employs diverse defensive strategies. If one layer fails, another

layer presents a choke point in order to prevent a full breach (Branum, Gegick, & Michael,

2005).

Counterintuitively, most cyber criminals have “sub-expert” skill levels when it comes

to committing cybercrime and avoiding detection. However, they can still do a great deal of

damage. On the other hand, employing the right network forensics techniques can aid in

preventing, detecting and investigating cybercrime (Steel, 2014).

The realization that most cyber criminals have “sub-expert” skill levels means that the

return on investment from building a layered defense would be high. However, to efficiently

capitalize on any infrastructure investments, government agencies need to react with a sense of

urgency. Most cyber criminals are between the ages of 17 and 45 (Li, 2008). Generation X and

Millennials were the last generations to be introduced to technology in both digital and analog

terms (Garvey, 2015). Consequently, the sophistication of the average cybercriminal is likely to

go up over time, due to the next generation of fraudsters being exposed to technology at a much

earlier age than the present generation of fraudsters.

Sections 3.1 through 3.6 explore options government agencies can employ based on their

fraud risk appetite. Ironically, the security options that are enumerated are the norm in the private

sector. The absence of similar options in the public sector has made government benefit

programs a target for cybercrime. Employing the right mix of these options can help government

programs ensure that they really ‘know the customer’ they are dealing with.

These options have a two-fold benefit: one benefit is that in and of themselves they can

help prevent fraud and detect fraud. A second benefit is that when fraud is detected, the

information (data) from the incidents and attacks can be collected and analyzed (data mining and


data analytics). Simultaneously, that data can be maintained in a data warehouse in order to later

build real-time predictive fraud models and analytics tools. The power of big data and predictive

fraud modeling and analytics is explored in Section 4. The ability to prevent and detect fraud

while working toward predictive modeling is the future of fighting government benefit fraud.

3.1 Account Access and Account Change Alerts. Account access and change alerts are

one of the least invasive methods government agencies can employ to prevent fraud. Real-world

example: someone contacts their bank to update their address. The bank updates the address and

contacts the customer (through mail, email, text or prerecorded call) to verify that the

customer made the requested change. The customer is told that if they did not make the change,

they should alert the bank immediately. The technology used to set up automated

notifications for account access and change alerts could later be utilized in Two-Factor

Authentication (see Section 3.2). Public sector agencies would have to identify what changes to

accounts are the most indicative of fraud for their agency. Some potential changes indicative of

fraud are changes to address, phone number, email, direct deposit information and username.
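The alert flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the `Alert` record and the `build_change_alerts` helper are hypothetical, and each agency would choose its own list of sensitive fields and delivery channels.

```python
from dataclasses import dataclass

# Hypothetical set of fields an agency might treat as indicative of fraud;
# the paper lists these as examples, and each agency would choose its own.
SENSITIVE_FIELDS = {"address", "phone", "email", "direct_deposit", "username"}


@dataclass
class Alert:
    account_id: str
    field: str
    channel: str
    message: str


def build_change_alerts(account_id, old, new, channel="email"):
    """Return one alert per sensitive field whose value changed."""
    alerts = []
    for field in sorted(SENSITIVE_FIELDS):
        if old.get(field) != new.get(field):
            label = field.replace("_", " ")
            alerts.append(Alert(
                account_id=account_id,
                field=field,
                channel=channel,
                message=(f"Your {label} was changed. If you did not make "
                         "this change, contact the agency immediately."),
            ))
    return alerts


# Example: only the address differs between the old and new record.
old_record = {"address": "12 Oak St", "email": "a@example.org", "phone": "555-0100"}
new_record = {"address": "98 Elm Ave", "email": "a@example.org", "phone": "555-0100"}
alerts = build_change_alerts("acct-001", old_record, new_record, channel="text")
for alert in alerts:
    print(alert.channel, alert.field, alert.message)
```

One design point worth noting: when the changed field is itself a contact detail, the notice is presumably most useful if it goes to the contact information that was on file before the change.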

3.2 Two-Factor Authentication. Two-factor authentication, or 2FA, uses two layers of

authentication when logging into accounts. If someone only has to enter a username and

password, then they are using single-factor authentication (something they know). 2FA requires

the user to have another credential (something they have). A common everyday example

would be using a credit card (something you have) and having to provide a zip code for the

mailing address (something you know) to authenticate the transaction (Rosenblatt, 2013).


In the context of identity theft and government benefit fraud, the options could look like this:

2FA is a form of secure access and identity management and, if employed correctly, can prevent

fraud beyond what identity proofing software alone can prevent. In this regard it can be paired

with identity proofing or used separately to identify and prevent fraud. Agencies that use identity proofing should

collect failed login attempts for analysis.

A cyber example of 2FA that digital customers are accustomed to would be a customer

saving their bank account on their phone (or other device) and normally only accessing the

account from their phone (or other device). Most banks would make the user authenticate their

identity if they tried to log in on another ‘unknown’ device. The customer would have

previously selected a communications preference (email, text, recorded voice etc.) when they set

up the online bank account. Consequently, the bank would authenticate the customer by sending

the customer a temporary password through the channel (email, text, recorded voice etc.) the customer

selected when they set up their online bank account. Agencies can set thresholds, allow for

opting out, establish whitelists and so on to account for claimants that use public computers, so that 2FA

does not create a barrier to legitimate customers.

Something you know:

-Personally identifiable information (PII)

-Username

-Password

-Etc.

Something you have:

-Email account

-Mobile phone

-Telephone

-Machine fingerprint

-MAC ID
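The temporary-password step described in this section might be sketched as follows. The six-digit format, the five-minute validity window and the function names are assumptions made for illustration and are not taken from any agency or bank system; delivery of the code by email, text or voice is left out.

```python
import hashlib
import hmac
import secrets
import time

CODE_TTL_SECONDS = 300  # assumed validity window; agencies would tune this


def issue_code(now=None):
    """Create a 6-digit temporary code and the record the server keeps."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    record = {
        # Keep a hash so the code itself is not stored server-side.
        "code_hash": hashlib.sha256(code.encode()).hexdigest(),
        "expires_at": now + CODE_TTL_SECONDS,
        "used": False,
    }
    return code, record


def verify_code(record, submitted, now=None):
    """Accept the code once, before expiry, using a constant-time compare."""
    now = time.time() if now is None else now
    if record["used"] or now > record["expires_at"]:
        return False
    submitted_hash = hashlib.sha256(submitted.encode()).hexdigest()
    if hmac.compare_digest(record["code_hash"], submitted_hash):
        record["used"] = True
        return True
    return False


def needs_second_factor(device_id, known_devices):
    """Require the extra step only when the device is not already known."""
    return device_id not in known_devices


# Example: a login from an unrecognized device triggers the extra step.
known = {"device-abc"}
if needs_second_factor("device-xyz", known):
    code, record = issue_code()
    # The code would be sent over the customer's chosen channel here.
    print(verify_code(record, code))
```

Checking `needs_second_factor` first reflects the point made above about thresholds and whitelists: the second factor is requested only when something about the login is unfamiliar, so customers on their usual device are not slowed down.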


3.2. Machine Fingerprinting. Machine fingerprinting has long been used by companies to identify a device

over the Internet without collecting any personally identifiable information. Collecting machine

fingerprints allows marketing companies to manage their content and the delivery of the content to the

right customer at the right time (The Digital Marketing Glossary).

Companies also harness this tool to identify fraudsters. For example, if one device has

launched an anomalous number of transactions, the company can flag the device. After flagging

the device, the company can review its activity to determine if it is a legitimate customer or a

fraudster. Government agencies can use this tool for authentication and identification purposes.

For example, suppose a legitimate customer is always known to log on using a particular device

(identified by machine fingerprinting). A fraudster then attempts to log into the system using the

customer’s credentials (username, password, PII etc.). The system could then deny service to the

fraudster until further authentication takes place (2FA, answering security questions correctly

etc.).

This layer of defense is promising, but it has limitations. For instance, if a user changes

enough features on their device it can generate a false flag. However, the system can be designed

to fail gracefully to minimize impact on legitimate customers. Furthermore, this situation can be

minimized if machine fingerprinting is paired with capturing MAC IDs.
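Both uses of machine fingerprints described in this section, flagging a device with an unusual volume of transactions and challenging a login from an unfamiliar device, can be illustrated with a short sketch. The threshold, the record layout and the function names below are hypothetical, and the fingerprint is assumed to arrive as an opaque string produced elsewhere.

```python
from collections import Counter

# Assumed threshold for illustration only; a real cutoff would come from
# the agency's own transaction history and fraud risk appetite.
MAX_TRANSACTIONS_PER_DEVICE = 3


def flag_busy_devices(transactions, limit=MAX_TRANSACTIONS_PER_DEVICE):
    """Return fingerprints that launched more than `limit` transactions."""
    counts = Counter(t["fingerprint"] for t in transactions)
    return {fp for fp, n in counts.items() if n > limit}


def login_decision(account_id, fingerprint, known_devices):
    """Allow a known device; otherwise ask for further authentication."""
    if fingerprint in known_devices.get(account_id, set()):
        return "allow"
    # Fail gracefully: request 2FA or security questions, not a hard denial.
    return "step_up"


transactions = [
    {"account": "A1", "fingerprint": "fp-111"},
    {"account": "A2", "fingerprint": "fp-999"},
    {"account": "A3", "fingerprint": "fp-999"},
    {"account": "A4", "fingerprint": "fp-999"},
    {"account": "A5", "fingerprint": "fp-999"},
]
known_devices = {"A1": {"fp-111"}}

print(flag_busy_devices(transactions))
print(login_decision("A1", "fp-111", known_devices))
print(login_decision("A1", "fp-999", known_devices))
```

Returning `"step_up"` rather than refusing the login is one way to get the graceful failure mentioned above: a legitimate customer whose fingerprint changed can still get in after an extra check.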

3.3 MAC IDs. A Media Access Control address, or MAC ID, is a “unique identifier that is

assigned to devices that allows them to optimize communications over networks” (Slagan, 2013).

MAC IDs are a mission-critical component of preventing and investigating government benefit

fraud for several reasons. One reason is the rise in the use of burner and mobile devices

to commit cybercrime. Burner and mobile devices often have generic IP addresses assigned to

them; consequently, sending subpoenas to the service providers often yields no results of

probative value.

However, many wireless and burner device companies collect MAC IDs, so a MAC ID can be a

valuable piece of information to collect for subpoena purposes. A second reason to collect MAC

IDs is that they can be used to differentiate between fraudsters and true customers.

Furthermore, MAC IDs can be used to identify multi-victim frauds quickly because MAC IDs

are harder to mask than IP addresses and machine fingerprints alone.
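To illustrate the multi-victim use, the sketch below groups accounts by a shared MAC ID. It assumes the agency already holds a MAC value for each login event; how that value is captured is outside the sketch, and the event layout and function names are hypothetical. Because the same address can be written in several notations, values are normalized before comparison.

```python
import re
from collections import defaultdict

_HEX12 = re.compile(r"^[0-9a-f]{12}$")


def normalize_mac(raw):
    """Return a MAC as lowercase aa:bb:cc:dd:ee:ff, or None if malformed."""
    # Strip the common separators (colon, hyphen, dot, whitespace) only.
    digits = re.sub(r"[:\-.\s]", "", raw).lower()
    if not _HEX12.match(digits):
        return None
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def accounts_by_mac(login_events, min_accounts=2):
    """Map each MAC to the accounts it touched, keeping shared MACs only."""
    seen = defaultdict(set)
    for event in login_events:
        mac = normalize_mac(event["mac"])
        if mac is not None:
            seen[mac].add(event["account"])
    return {mac: sorted(accts) for mac, accts in seen.items()
            if len(accts) >= min_accounts}


events = [
    {"account": "A1", "mac": "00:1A:2B:3C:4D:5E"},
    {"account": "A2", "mac": "00-1a-2b-3c-4d-5e"},
    {"account": "A3", "mac": "001a.2b3c.4d5e"},
    {"account": "A4", "mac": "66:77:88:99:aa:bb"},
    {"account": "A5", "mac": "not-a-mac"},
]
print(accounts_by_mac(events))
```

A MAC that appears on several unrelated accounts is a lead for review rather than proof of fraud, since household members or public computers can produce the same pattern.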

3.4 Multi-Page Login Sequence and User-Selected Security Images. Multi-page login

sequences help prevent customers (whether business customers or beneficiary customers) from

becoming the victims of phishing scams. Example without a multi-page login sequence: a customer

gets a spoofed communication from a government agency and follows the link to a fictitious

agency website. They enter their username and password on the first page. The fraudster now has

their username and password and any other information the customer provides before they

realize it is a spoofed website.

See examples in figures below from Nelnet.com, a federal student loan company.

[Figure: Claimant inputs username on 1st page and is directed to 2nd page.]

[Figure: Claimant inputs password on 2nd page if they recognize the security image they selected.]


Example with a multi-page login sequence: the aforementioned scenario is prevented because,

even if the claimant provides the fraudster with their username, the absence of their selected

security image on the second page alerts them to stop before entering their password. Also, multi-page

login sequences make bot attacks more complicated.
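The two-page sequence can be sketched as two separate server-side steps. This is a simplified illustration, not Nelnet's or any agency's implementation: the in-memory `users` dictionary, the function names and the PBKDF2 work factor are all assumptions.

```python
import hashlib
import hmac
import os

_ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, _ITERATIONS)


def register(users, username, password, security_image):
    """Store a salted password hash and the image the claimant selected."""
    salt = os.urandom(16)
    users[username] = {
        "salt": salt,
        "password_hash": hash_password(password, salt),
        "security_image": security_image,
    }


def login_step_one(users, username):
    """Page 1: take the username, return the image to show on page 2."""
    user = users.get(username)
    # A production system would likely avoid revealing whether the
    # username exists; returning None here keeps the sketch short.
    return user["security_image"] if user else None


def login_step_two(users, username, password):
    """Page 2: check the password after the claimant confirms the image."""
    user = users.get(username)
    if user is None:
        return False
    candidate = hash_password(password, user["salt"])
    return hmac.compare_digest(candidate, user["password_hash"])


users = {}
register(users, "claimant1", "correct horse", "lighthouse.png")
print(login_step_one(users, "claimant1"))
print(login_step_two(users, "claimant1", "correct horse"))
```

A spoofed site cannot complete the first step, because it does not know which image the claimant selected; that missing image is the cue for the claimant to stop before the password is ever typed.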

3.5 Security Questions. User-selected security questions and answers are a standard tool

for preventing identity theft online. However, they are not enough to prevent fraud in situations

where it is the fraudster setting up the account. Likewise, they are not enough to prevent fraud

if a fraudster knows or obtains the answers. Even so, they remain a classic fraud prevention tool.

Furthermore, they can be utilized to identify multi-victim frauds. Answers to security

questions provide valuable data points because fraudsters often reuse the same pattern of passwords and answers across accounts.
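One way to use security answers as data points is to look for the same answer repeated across many accounts. The sketch below is a hypothetical illustration: the record layout, the `min_accounts` threshold and the use of a SHA-256 digest as a grouping key are assumptions, not a description of any agency's system.

```python
import hashlib
from collections import defaultdict


def answer_key(question_id, answer):
    """Hash a normalized answer so raw answers need not be compared."""
    # Lowercase and collapse whitespace so trivial variations still match.
    normalized = " ".join(answer.lower().split())
    return hashlib.sha256(f"{question_id}|{normalized}".encode()).hexdigest()


def shared_answers(records, min_accounts=3):
    """Return groups of accounts that gave the same answer to a question."""
    groups = defaultdict(set)
    for rec in records:
        key = answer_key(rec["question_id"], rec["answer"])
        groups[key].add(rec["account"])
    return sorted(sorted(accts) for accts in groups.values()
                  if len(accts) >= min_accounts)


records = [
    {"account": "A1", "question_id": "first_pet", "answer": "Fluffy123"},
    {"account": "A2", "question_id": "first_pet", "answer": "fluffy123 "},
    {"account": "A3", "question_id": "first_pet", "answer": "FLUFFY123"},
    {"account": "A4", "question_id": "first_pet", "answer": "Rex"},
]
print(shared_answers(records))
```

Common answers will collide innocently, so a threshold such as `min_accounts` and a manual review step would be needed before treating a group as a multi-victim scheme.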

3.6 IP Geocoding and Mapping. Collecting and analyzing IP addresses used to access

the system is a mission-critical component of fighting government benefit fraud. Collecting IP

information enables quicker identification of multi-victim scenarios. IP addresses can be mapped

down to the zip code level. This information can provide critical intelligence to investigators

(Altoff, 2008). Likewise, it enables agencies to block known threats (blacklisting). There are

limitations to IP geocoding, especially with wireless service providers and users that utilize

burner devices.
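The blacklisting and multi-victim checks mentioned above can be sketched with Python's standard `ipaddress` module. Geocoding itself is not shown because it depends on an external geolocation database; the blocked range and event records are hypothetical and use address blocks reserved for documentation.

```python
import ipaddress
from collections import defaultdict

# Documentation-only address ranges (RFC 5737) stand in for real threats.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(ip_text, blocked=BLOCKED_NETWORKS):
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(ip_text)
    return any(ip in net for net in blocked)


def accounts_per_ip(events, min_accounts=2):
    """Map each IP to the accounts accessed from it, keeping shared IPs."""
    by_ip = defaultdict(set)
    for event in events:
        by_ip[event["ip"]].add(event["account"])
    return {ip: sorted(accts) for ip, accts in by_ip.items()
            if len(accts) >= min_accounts}


events = [
    {"account": "A1", "ip": "198.51.100.7"},
    {"account": "A2", "ip": "198.51.100.7"},
    {"account": "A3", "ip": "192.0.2.10"},
]
print(is_blocked("203.0.113.45"))
print(is_blocked("192.0.2.10"))
print(accounts_per_ip(events))
```

Shared addresses are common behind mobile carriers and public networks, which is the limitation noted above, so a shared IP is better treated as one signal among several than as a verdict.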


4. Predictive Fraud Modeling and Analytics

IP addresses, machine fingerprints and MAC IDs would ideally be used together to

strengthen fraud detection (increased accuracy) while also reducing false positives (decreased

disruptions to real claimants). Furthermore, collecting these three pieces of information (network

events) gives investigators multiple avenues to explore and to investigate any fraudsters that

successfully gain access to the system.
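A simple way to picture the three signals working together is a rule that counts how many of them match what has been seen before for an account. The thresholds and decision labels below are assumptions chosen for illustration; a production model would be fitted to an agency's own data rather than hand-set.

```python
SIGNALS = ("ip", "fingerprint", "mac")


def matching_signals(login, history):
    """Count how many identifiers match ones seen before for this account."""
    return sum(1 for s in SIGNALS if login.get(s) in history.get(s, set()))


def decide(login, history):
    """Combine the three signals into one decision (assumed thresholds)."""
    matches = matching_signals(login, history)
    if matches >= 2:
        return "allow"    # one changed signal alone does not disrupt access
    if matches == 1:
        return "step_up"  # ask for 2FA or security questions
    return "review"       # nothing familiar: route to an analyst


history = {
    "ip": {"198.51.100.7"},
    "fingerprint": {"fp-111"},
    "mac": {"00:1a:2b:3c:4d:5e"},
}

# Same device on a new network: two of three signals still match.
traveling = {"ip": "192.0.2.10", "fingerprint": "fp-111", "mac": "00:1a:2b:3c:4d:5e"}
# Nothing matches the account's history.
stranger = {"ip": "203.0.113.45", "fingerprint": "fp-999", "mac": "66:77:88:99:aa:bb"}

print(decide(traveling, history))
print(decide(stranger, history))
```

Requiring agreement between signals is what reduces false positives here: a claimant who changes networks is not challenged, while a login that matches nothing on file is.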

The network events enumerated in Sections 3.1 through 3.6 are events that can be of

great probative value on their own. The network events are even more valuable if they are

captured and stored in a data warehouse³. If the aforementioned network events are captured

using forensically sound methods and stored in a data warehouse, then the scope and impact of

investigations increase substantially. Simply put, it gives analysts and investigators access to

the evidence.
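The paper does not specify how events would be kept forensically sound. One common technique, offered here only as an illustrative assumption, is a hash chain: each stored event carries a digest that depends on the previous entry, so any later alteration becomes detectable. The function names and record layout below are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    """Hash the previous digest together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append_event(log, event):
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify_log(log):
    """Recompute every digest; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_event(log, {"account": "A1", "ip": "198.51.100.7", "action": "login"})
append_event(log, {"account": "A1", "field": "address", "action": "change"})
print(verify_log(log))

# Altering a stored event after the fact breaks the chain.
log[0]["event"]["ip"] = "192.0.2.99"
print(verify_log(log))
```

A chain like this makes tampering evident but does not prevent it; write-once storage and documented chain of custody would still be needed alongside it.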

Access to the evidence (in an unalterable state) allows analysts and investigators to both

identify and investigate frauds without fear of compromising the digital evidence. Therein is the

sweet spot: not only does implementing the aforementioned network infrastructure prevent fraud

but it also increases the rate and speed of detection. Furthermore, agencies can harness the power

of machine learning, predictive modeling and fraud analytics to make informed business

decisions (Cutts, 2015). Decisions properly informed by machine learning, predictive

modeling and fraud analytics can be used to break the pay-and-chase cycle that many government

agencies face.

3 According to Designing Data Warehouses for Cyber Crimes, a data warehouse is, “a data repository that

contains historical data for effective data analysis and reporting processes. Data warehouses are designed to support

decision making by studying and analyzing complex sets of data.”


5. Conclusion

Government agencies make unique targets for cybercrime. As government agencies

continue to move toward serving customers in multi-channel environments, the need to know

their customer increases. Government agencies can utilize network forensics and big data to

build a layered defense in order to increase the rate of fraud prevention and detection.

Furthermore, the information arms investigators with the tools they need to investigate and

subsequently pursue prosecution.


References

Altoff, P. (2008). Five Ways to Detect Fraud Using Geolocation. Branded3. https://www.branded3.com/blog/five-ways-to-detect-fraud-using-geolocation/

Branum, S., Gegick, M., & Michael, C. (2005). Defense in Depth. Department of Homeland Security. Retrieved from https://buildsecurityin.us-cert.gov/articles/knowledge/principles/defense-in-depth

Fick, J. (2009). Prevention is Better than Prosecution: Deepening the Defense against Cyber Crime. Journal of Digital Forensics, Security and Law, Vol. 4(4), 51-72. http://ojs.jdfsl.org/index.php/jdfsl/article/view/159/76

Garvey, A. (2015). The Oregon Trail Generation: Life Before and After Mainstream Tech. Social Media Week. http://socialmediaweek.org/blog/2015/04/oregon-trail-generation/

Kessler, G. (2010). Judges’ Awareness, Understanding, and Application of Digital Evidence. Journal of Digital Forensics, Security and Law, Vol. 6(1), 55-72. http://ojs.jdfsl.org/index.php/jdfsl/article/view/27

Li, X. (2008). The Criminal Phenomenon on the Internet: Hallmarks of Criminals and Victims Revisited Through Typical Cases Prosecuted. Social Science Research Network. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2034529

Rosenblatt, S. (2013). Two-Factor Authentication: What You Need to Know (FAQ). CNET. http://www.cnet.com/news/two-factor-authentication-what-you-need-to-know-faq/

Sammons, J. (2012). The Basics of Digital Forensics. Boston: Syngress, p. 2.

Sindhu, K. & Meshram, B. (2012). Digital Forensics and Cyber Crime Data Mining. Journal of Digital Forensics, Security and Law, Vol. 3, 196-201. http://www.estij.org/papers/vol2no12012/19vol2no1.pdf

Slagan, S. (2013). What is a MAC Address & Can It Be Used to Secure Your Home Network. Make Use Of. http://www.makeuseof.com/tag/what-is-a-mac-address-can-it-be-used-to-secure-your-home-network-makeuseof-explains/

Steel, C. (2014). Idiographic Digital Profiling: Behavioral Analysis Based on Digital Forensics. Journal of Digital Forensics, Security and Law, Vol. 9(1). ojs.jdfsl.org/index.php/jdfsl/article/view/122/201

The Digital Marketing Glossary. What is Device Fingerprinting Definition? http://digitalmarketing-glossary.com/What-is-Device-fingerprinting-definition

United States Department of Justice. (2015, April). Defendants Charged in Separate Fraud Schemes that Resulted in Thousands of Identities Stolen and Used to Commit Fraud Schemes. Retrieved from http://www.justice.gov/usao-sdfl/pr/defendants-charged-separate-fraud-schemes-resulted-thousands-identities-stolen-and-used
