Paper:
Anomaly Detection as a Reputation System for Online Auctioning
November 10,
2005
Existing reputation systems used by online auction
houses do not address the concern of a buyer
shopping for commodities---finding a good
bargain. These systems do not provide information on
the practices adopted by sellers to ensure
profitable auctions. These practices may be
legitimate, like imposing a minimum starting bid on
an auction, or fraudulent, like using colluding
bidders to inflate the final price in a practice
known as shilling.
We develop a reputation system to help buyers
identify sellers whose auctions seem
price-inflated. Our reputation system is based upon
models that characterize sellers according to
statistical metrics related to price inflation. We
combine the statistical models with anomaly
detection techniques to identify the set of
suspicious sellers. The output of our reputation
system is a set of values for each seller
representing the confidence with which the system
can say that the auctions of the seller are
price-inflated.
We evaluate our reputation system on 604 high-volume
sellers who posted 37,525 auctions on eBay. Our
system automatically pinpoints sellers whose
auctions contain potential shill bidders. When we
manually analyze these sellers' auctions, we find
that many winning bids are at about the items'
market values, thus undercutting a buyer's ability
to find a bargain and demonstrating the
effectiveness of our reputation system.
Paper:
String Analysis for x86 Binaries
Mihai Christodorescu, Nicholas Kidd, Wen-Han Goh
September 6,
2005
Information about string values at key points in a program can
help program understanding, reverse engineering, and
forensics. We present a static-analysis technique for
recovering possible string values in an executable program,
when no debug information or source code is available. The
result of our analysis is a regular language that describes a
superset of the string values possible at a given program
point. We also impart some of the lessons learned in the
process of implementing our analysis as a tool for recovering
C-style strings in x86 executables.
Paper:
Semantics-Aware Malware Detection
Mihai Christodorescu, Somesh Jha, Sanjit A. Seshia, Dawn Song, Randal E. Bryant
May 9,
2005
A malware detector is a system that attempts to determine
whether a program has malicious intent. In order to evade
detection, malware writers (hackers) frequently use
obfuscation to morph malware. Malware detectors that use a
pattern-matching approach (such as commercial virus scanners)
are susceptible to obfuscations used by hackers. The
fundamental deficiency in the pattern-matching approach to
malware detection is that it is purely syntactic and ignores
the semantics of instructions. In this paper, we present a
malware-detection algorithm that addresses this deficiency by
incorporating instruction semantics to detect malicious
program traits. Experimental evaluation demonstrates that our
malware-detection algorithm can detect variants of malware
with a relatively low run-time overhead. Moreover, our
semantics-aware malware detection algorithm is resilient to
common obfuscations used by hackers.
Presentation:
Testing Malware Detectors
July 12,
2004
In today's interconnected world, malware, such as worms and
viruses, can cause havoc. A malware detector (commonly known
as virus scanner) attempts to identify malware. In
spite of the importance of malware detectors, there is a
dearth of testing techniques for evaluating them. We present
a technique based on program obfuscation for generating
tests for malware detectors. Our technique is geared towards
evaluating the resilience of malware detectors to various
obfuscation transformations commonly used by hackers to
disguise malware. We also demonstrate that a hacker can
leverage a malware detector's weakness in handling
obfuscation transformations and can extract the signature
used by a detector for a specific malware. We evaluate three
widely-used commercial virus scanners using our techniques
and discover that the resilience of these scanners to
various obfuscations is very poor.
Paper:
Testing Malware Detectors
Mihai Christodorescu, Somesh Jha
July 12,
2004
In today's interconnected world, malware, such as worms and
viruses, can cause havoc. A malware detector (commonly known
as virus scanner) attempts to identify malware. In
spite of the importance of malware detectors, there is a
dearth of testing techniques for evaluating them. We present
a technique based on program obfuscation for generating
tests for malware detectors. Our technique is geared towards
evaluating the resilience of malware detectors to various
obfuscation transformations commonly used by hackers to
disguise malware. We also demonstrate that a hacker can
leverage a malware detector's weakness in handling
obfuscation transformations and can extract the signature
used by a detector for a specific malware. We evaluate three
widely-used commercial virus scanners using our techniques
and discover that the resilience of these scanners to
various obfuscations is very poor.
Presentation:
Static Analysis of Executables to Detect Malicious Patterns
August 7,
2003
Malicious code detection is a crucial component of any
defense mechanism. In this paper, we present a unique
viewpoint on malicious code detection. We regard malicious
code detection as an obfuscation-deobfuscation game between
malicious code writers and researchers working on malicious
code detection. Malicious code writers attempt to obfuscate
the malicious code to subvert the malicious code detectors,
such as anti-virus software. We tested the resilience of
three commercial virus scanners against code obfuscation
attacks. The results were surprising: the three commercial
virus scanners could be subverted by very simple obfuscation
transformations! We present an architecture for detecting
malicious patterns in executables that is resilient to
common obfuscation transformations. Experimental results
demonstrate the efficacy of our prototype tool, SAFE (a
static analyzer for
executables).
Paper:
Static Analysis of Executables to Detect Malicious Patterns
Mihai Christodorescu, Somesh Jha
February 6,
2003
Malicious code detection is a crucial component of any
defense mechanism. In this paper, we present a unique
viewpoint on malicious code detection. We regard malicious
code detection as an obfuscation-deobfuscation game between
malicious code writers and researchers working on malicious
code detection. Malicious code writers attempt to obfuscate
the malicious code to subvert the malicious code detectors,
such as anti-virus software. We tested the resilience of
three commercial virus scanners against code obfuscation
attacks. The results were surprising: the three commercial
virus scanners could be subverted by very simple obfuscation
transformations! We present an architecture for detecting
malicious patterns in executables that is resilient to
common obfuscation transformations. Experimental results
demonstrate the efficacy of our prototype tool, SAFE (a
static analyzer for
executables).
Presentation:
General Purpose Binary Rewriting
January 27,
2003
A description of the architecture for a binary rewriting tool.
Presented at the
MURI Meeting, January 2003, Williamsburg, VA, USA
.
Presentation:
Detection of Malicious Code Patterns in Executables
via Model Checking
July 12,
2002
A description of a malicious code detection tool that uses
model checking ideas. We apply this tool to detecting viruses
and explain our results.
Presented at the
MURI Meeting, July 2002, Harpers Ferry, WV, USA
.
Presentation:
Virus Scanning as Model Checking
January 14,
2002
A description of a virus scanner that uses model checking
ideas.
Presented at the
MURI Meeting, January 2002, Washington DC
.
Presentation:
Hardware Security Support in General Purpose Processors
December 10,
2001
A performance analysis of the XOM architecture.
Written part of the coursework for
UWisc CS 752 Advanced Computer Architecture, Fall 2001.
Presentation:
Model Checking for Binaries
November 11,
2001
A quick look into model checking of binaries, with an
interesting application: virus scanning.
Paper:
Playing Inside the Black Box:
Using Dynamic Instrumentation to Create Security Holes
February 9,
2001
Programs running on insecure or malicious hosts have often
been cited as ripe targets for security attacks. The
enabling technology for these attacks is the ability to
easily analyze and control the running program. Dynamic
instrumentation provides the necessary technology for this
analysis and control. As embodied in the DynInst API
library, dynamic instrumentation allows easy construction of
tools that can: (1) inspect a running process, obtaining
structural information about the program; (2) control the
execution of the program, (3) cause new libraries to be
dynamically loaded into the process' address space; (4)
splice new code sequences into the running program and
remove them; and (5) replace individual call instructions or
entire functions.
With this technology, we have provided two demonstrations of
its use: exposing vulnerabilities in a distributed
scheduling system (Condor), and bypassing access to a
license server by a word processor (Framemaker). The first
demonstration shows the danger of remote execution of a job
on a system of unknown pedigree, and the second
demonstration shows the vulnerabilities of software license
protection schemes. While these types of vulnerabilities
have long been speculated, we show how, with the right tool
(the DynInst API), they can be easily accomplished. Along
with this discussion of vulnerabilities, we also discuss
strategies for compensating for them.
Presentation:
Bypassing License Checking Using Dyninst
March 24,
2000
Presentation on how we used Dyninst to bypass license checks.
I presented the work I have done, with a couple of
colleagues, on using Dyninst for dynamically modifying the
behavior of a program while it is running. In our case, the
modification we performed allowed us to bypass the checks
for valid license.
Presented at the
Paradyn Week,
2000
.
Presentation:
Active Storage Devices
March 6,
2000
A quick survey of the research in active devices.
Written part of the coursework for
UWisc CS 703 Advanced Topics in Programming Languages and
Compilers, Spring 2000.
Paper:
Opening Pandora's Box:
Using Binary Code Rewrite to Pypass License Checks
December 13,
1999
A common method of enforcing software license terms is for a
program to contact another program, called a license server,
and ask for permission to run. This project attempts to
bypass these license checks in a commercial product through
runtime code modification, using the DynInst library.
The programs chosen as victims for this study are Adobe
FrameMaker, the Purify family of programs, and MatLab. We
successfully bypass the FrameMaker licensing checks,
allowing full use of the product when the license server is
unavailable. Limitations in DynInst prevent similar results
with Purify or MatLab. A set of powerful tools has been
developed and used in the process, and their generality
should simplify similar license bypassing efforts on other
software products.
Published as
University of Wisconsin, Madison, Computer Sciences Department
Technical Report # 1479
.
Written part of the coursework for
UWisc CS 736 Advanced Operating Systems, Fall 1999.
Paper:
SunOS 5.6 Kernel Performance Measurements
September 28,
1999
I have performed time measurements on stock SunOS 5.6 /
Ultra SPARC IIi to gauge the overhead of basic system calls
(trivial calls, IPC, I/O, and memory-related). The
benchmarking in itself was a challenge, because from user
space there is tremendous indirection and associated
overhead to each function call (library or system).
The 64-bit system call gethrtime() was found to be
useful, with a high precision, very to the hardware clock
precision of 3.3 ns. Several kernel calls were compared to
find the kernel call overhead, and getpid() proved to
be the fastest one. Signals- vs. pipes-based IPC comparisons
clearly presented one-way unnamed pipe as a much better
alternative. Further on, the latency of page faults for
zero-filled pages revealed some high values, to be reckoned
with when allocating large amount of memory. Finally, I have
measured the basic read() system calls, along with
its counterparts fread() and mmap(), and
noticed that fread() performs much better that
read() and that mmap() performance recommends
it for small read buffer sizes.
The paper concludes by suggesting several ways to improve
performance based on the measurements obtained, as well as
by a short discussion on how to get relevant
measurements. The topic is by no means open to further
exploration.
Written part of the coursework for
UWisc CS 736 Advanced Operating Systems, Fall 1999.
Paper:
Illustration of GCC Compiler Optimizations for Ultra SPARC IIi
September 17,
1999
The following code optimizations are illustrated: Redundant
Expression Elimination (Common Subexpression Elimination),
Partially Redundant Expression Elimination, Constant
Propagation, Copy Propagation, Constant Folding, Dead Code
Elimination, Loop Invariant Code Motion, Scalarization (Scalar
Replacement), Local Register Allocation, Global Register
Allocation, Interprocedural Register Allocation, Register
Targeting, Interprocedural Code Motion, Call Inlining, Code
Hoisting and Sinking, Loop Unrolling, Strength
Reduction. Cache effects are briefly explained for data cache
conflict misses and instruction cache loop unrolling.
Written part of the coursework for
UWisc CS 701 Construction of Compilers, Fall 1999.