Artificial Immune System for Computer Security
Threats and intrusions in IT systems can be compared to human diseases, with the difference that the human body has an effective way to deal with them, whereas an equivalent mechanism still needs to be designed for IT systems. The human immune system (HIS) can detect and defend against previously unseen intruders, and it is distributed, adaptive and multilayered, to name only a few of its features.
Our immune system incorporates a powerful and diverse set of characteristics which are very interesting to use in an artificial immune system (AIS). Within AIS I am working on computer security, as I think security should be our first priority.
WHAT IS AIS
Artificial Immune Systems (AIS) is a branch of biologically inspired computation focusing on many aspects of
immune systems. AIS development can be seen as having two target domains: the
provision of solutions to engineering problems through the adoption of immune
system inspired concepts; and the provision of models and simulations with
which to study immune system theories.
KEY WORDS
AIS, immune system, artificial immune system, virus, negative selection model, Hierarchical Artificial Immune Model
How AIS relates to the biological immune system
In medical science, historically,
the term immunity refers to the condition in which an organism can resist
disease, more specifically infectious disease. However, a broader definition of
immunity is a reaction to foreign (or dangerous) substances.
Immunology concerns the study of
the immune system and the effects of its operation on the body. The immune
system is normally defined in relation to its perceived function: a defence
system that has evolved to protect its host from pathogens (harmful
micro-organisms such as bacteria, viruses and parasites) [Goldsby et al. 2003].
It comprises a variety of specialised cells that circulate and monitor the
body, various extra-cellular molecules, and immune organs that provide an
environment for immune cells to interact, mature and respond. The collective
action of immune cells and molecules forms a complex network leading to the
detection and recognition of pathogens within the body. This is followed by a
specific effector response aimed at
eliminating the pathogen. This
recognition and response process is vastly complicated with many of the details
not yet properly understood.
Human
Immune System Components
Bio and Artificial Immune mapping
Biological Immune System | Artificial Immune System
Human Body | Computer network
Organisms / Organs | Nodes / Files
Antibodies | Mobile Agents
Antigens | Software Virus
Immunity, Suppression | Immunity, Tolerance
Neural Controller | Server
Immune memory | Look up Table
Training patterns | Virus Signatures
Receptors | Detectors
Bio Connectivity | Wireless / Wired Link
Organ address | IP Address
Time of Attack | Time of Virus Detection
Cloning Agent | Replication
Recovery Time | Agent Life Time
Natural Immunity | Built-in Security
Acquired Immunity | Agent based Security
Natural Death | Dead PC
What Motivated Them?
Why is it that engineers are attracted to the immune system for inspiration? The immune system exhibits several properties that engineers recognise as being desirable in their systems. [Timmis & Andrews 2007, Timmis et al. 2008a, de Castro & Timmis 2002a] have identified these as:
1) Distribution and self-organization.
The behavior of the immune system is deployed through the actions of billions of agents (cells and molecules) distributed throughout the body. Their collective effects can be highly complex with no central controller. An organised response emerges as a system-wide property derived from the low-level agent behaviours. These immune agents act concurrently, making immune processes naturally parallelised.
2) Learning, adaptation, and memory.
The immune system is capable of recognizing previously unseen pathogens, and thus exhibits the ability to learn. Learning implies the presence of memory, which is present in the immune system, enabling it to ‘remember’ previously encountered pathogens. This is encapsulated by the phenomenon of primary and secondary responses: the first time a pathogen is encountered an immune response (the primary response) is elicited. The next time that pathogen is encountered a faster and often more aggressive response is mounted (the secondary response).
3) Pattern recognition.
Through its various receptors and molecules the immune system is capable of recognising a diverse range of patterns. This is accomplished through receptors that perceive antigenic materials in differing contexts (processed molecules, whole molecules, additional signals etc). Receptors of the innate immune system vary little, whilst receptors of the adaptive immune system, such as antibodies and T-cell receptors, are subject to huge diversity.
4) Classification.
The
immune system is very effective at distinguishing harmful substances (non-self)
from the body’s own tissues (self), and directing its actions accordingly. From
a computational perspective, it does this with access to only a single class of
data, self molecules [Stibor et al. 2005]. Creation of a system that
effectively classifies data into two classes, having been trained on examples
from only one, is a challenging task.
Different models of Artificial Immune Systems
Artificial Immune Systems (AIS) emerged in the 1990s as a new branch of Computational Intelligence (CI). A number of AIS models exist, and they are used in pattern recognition, fault detection, computer security, and a variety of other applications that researchers are exploring in science and engineering. Although AIS research has been gaining momentum, the changes in the fundamental methodologies have not been dramatic. Among the various mechanisms in the biological immune system that are explored as AISs, negative selection, the immune network model and clonal selection are still the most discussed models.
Here I am going to focus only on negative selection, as it has many applications in computer security.
Negative Selection
Negative selection is a process of selection that takes place in the thymus gland. T cells are produced in the bone marrow and, before they are released into the lymphatic system, undergo a maturation process in the thymus gland. The maturation of the T cells is conceptually very simple. T cells are exposed to self-proteins in a binding process. If this binding activates the T cell, then the T cell is killed; otherwise it is allowed into the lymphatic system. This process of censoring prevents cells that are reactive to self from entering the lymphatic system, thus endowing (in part) the host’s immune system with the ability to distinguish between self and non-self agents.
Artificial Negative Selection
The negative selection algorithm of Forrest et al. is one of the computational models of self/nonself discrimination, first designed as a change detection method. It is one of the earliest AIS algorithms to be applied in various real-world applications. Since it was first conceived, it has attracted many AIS researchers and practitioners and has evolved considerably. In spite of the evolution and diversification of this method, the main characteristics of the negative selection algorithm described by Forrest et al. remain: a generation stage followed by a detection stage.
In the generation stage, the detectors are generated by some random process and censored by trying to match self samples. Those candidates that match are eliminated and the rest are kept as detectors. In the detection stage, the collection of detectors (or detector set) is used to check whether an incoming data instance is self or non-self. If it matches any detector, then it is claimed as non-self or anomaly. This description is limited to some extent, but conveys the essential idea. Like any other Computational Intelligence technique, different negative selection algorithms are characterized by particular representation schemes, matching rules and detector generation processes.
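The two stages described above can be sketched in a few lines of Python. This is only a minimal illustration, not Forrest et al.'s implementation: it assumes binary strings as the representation, an r-contiguous matching rule, and purely random detector generation. The function names, the self set and the parameter values are made up for the example.

```python
import random


def matches(detector: str, sample: str, r: int) -> bool:
    """r-contiguous rule: True if the strings agree in at least r adjacent positions."""
    run = 0
    for d, s in zip(detector, sample):
        run = run + 1 if d == s else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, length, r, count, rng, max_tries=100_000):
    """Generation stage: keep random candidates that match no self sample."""
    detectors = set()
    for _ in range(max_tries):
        if len(detectors) >= count:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        # Censoring: a candidate that matches any self sample is eliminated.
        if not any(matches(candidate, s, r) for s in self_set):
            detectors.add(candidate)
    return detectors


def is_nonself(sample, detectors, r):
    """Detection stage: a sample matched by any detector is claimed as non-self."""
    return any(matches(d, sample, r) for d in detectors)


# Hypothetical self set of 8-bit strings, chosen only for illustration.
self_set = {"00000000", "00001111"}
detectors = generate_detectors(self_set, length=8, r=4, count=5, rng=random.Random(0))
```

By construction no self sample is ever flagged, because every detector that matched a self sample was discarded during generation. How many non-self strings are caught depends on the number of detectors and on r.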
AIS Applications
Artificial Immune Systems (AIS) are being used in many applications such as:
1) anomaly detection
2) pattern recognition
3) data mining
4) computer security
5) adaptive control
6) fault detection.
Computer Security
I am working on computer security only. I chose it because computer security should be our first priority. The world has become a more interconnected place. Electronic communication, e-commerce, network services and the Internet have become vital components of business strategies, government operations, and private communications. Many organizations have become dependent on the wired world for their daily activities. This interconnectivity has also brought forth those who wish to exploit it. Computer security has, thus, become a necessity in the digital age. While information dependence is increasing, the threat from malicious code, such as computer viruses, is also on the rise. The number of computer viruses has been increasing exponentially from their first appearance in 1986 to over 55 000 different strains identified today. Viruses were once spread by sharing disks; now, global connectivity allows malicious code to spread farther and faster. Similarly, computer misuse through network intrusion is on the rise.
With the rapid development of computer technology, new anti-malware technologies are required because malware is becoming more complex, with a faster propagation speed and a stronger ability for latency, destruction, and infection.
Many companies have released anti-malware software, most of which is based on signatures and can detect known malware very quickly. However, such software often fails to detect new variations and unknown malware. Using metamorphic and polymorphic techniques, even a layman is able to develop new variations of known malware easily with a malware automaton. Thus, traditional signature-based malware detection methods are no longer suitable for new environments, and heuristic methods have started to emerge.
For the past few years, applying immune mechanisms to computer security has developed into a new field, attracting many researchers. Forrest applied immune theory to computer abnormality detection for the first time in 1994. Since then, many researchers have proposed various malware detection models and achieved some success.
Immunological computation has also been applied to other problem domains, not all of which are in the computer-security field. Some of the more interesting examples include anomaly detection in time series data, fault diagnosis, decision support systems, multi-optimization problems, robust scheduling, and loan application fraud detection. What all of these applications have in common is that they use the pattern-matching and “learning” mechanisms of the immune system model to provide the desired system features. A lot of theoretical groundwork in immunological computation has been completed, but only a handful of AISs have been built.
Many AIS models exist for detecting viruses and malicious code. For virus detection, I describe the following one.
A Hierarchical Artificial Immune Model for Virus Detection
A. Model Architecture
The model is composed of two modules:
1) virus gene library generating module
2) self-nonself classification module.
The first module is used for the training phase; its function is to generate a detecting gene library from the given training data. The second module is used for the detecting phase: it uses the results from the first module to detect suspicious programs.
In biology, genetic information is mainly stored in DNA, but not all fragments of DNA express useful information. Only a gene is a fragment of DNA with genetic information. A gene is made up of several deoxyribonucleotides (ODN).
• DNA: The whole bit-string of a procedure.
• Gene: Virus detector, a fragment of virus DNA, the compared unit for virus detection.
• ODN: Every two bytes of a bit-string.
The relation of DNA, gene and ODN is shown below.
DNA: ODN | ODN | ODN | ODN | ODN | ODN | ODN | ODN | ODN
A gene is a fragment of DNA which contains genetic information. A series of ODNs composes a gene.
Figure: The relationship among DNA, gene and ODN.
The code of a virus corresponds to the DNA of an organism. A small quantity of that code performs the viral behaviour and is regarded as the genes of the virus. These virus genes are composed of several virus ODNs, which are the smallest unit for analyzing the virus. At this stage, the most important task of the model is to extract the genes of a virus.
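To make the terms concrete, the following small Python sketch cuts a program's byte string (its "DNA") into ODNs with a sliding window. It assumes an ODN is two bytes and that the window advances one byte at a time; the step size, the function name and the sample bytes are assumptions made for illustration.

```python
def extract_odns(dna: bytes, size: int = 2, step: int = 1) -> list[bytes]:
    """Return every `size`-byte ODN seen by a window sliding over `dna`."""
    return [dna[i:i + size] for i in range(0, len(dna) - size + 1, step)]


# Hypothetical five-byte program fragment.
dna = bytes.fromhex("b8004ccd21")
odns = extract_odns(dna)
for odn in odns:
    print(odn.hex())
```

With a step of one byte, consecutive ODNs overlap by one byte, so a program of n bytes yields n - 1 ODNs.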
B. Virus Gene Library Generating Module
The virus gene library generating module works on a training set consisting of legal and virus programs.
Firstly, this module counts the ODNs in the DNA of the legal and virus programs, respectively, using a sliding window, in order to extract the ODNs which are regarded as representative of viruses. A virus ODN library is built from the statistical information obtained.
Secondly, the DNAs of the virus and legal programs are traversed with the ODNs in the virus ODN library to generate the candidate virus gene library and the legal virus-like gene library.
Finally, according to the negative selection mechanism, we match all the genes in the candidate virus gene library with the genes in the legal virus-like gene library, and delete those genes which appear in both libraries. In this way, the candidate library is upgraded to the detecting virus gene library.
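The first and last of these steps can be sketched as follows. The text does not say which statistic is used to pick representative ODNs, so this sketch assumes a simple one: an ODN is kept when the fraction of virus samples containing it exceeds the fraction of legal samples containing it by at least `min_gap`. That criterion, the function names and the sample data are all assumptions.

```python
from collections import Counter


def odns(dna: bytes, size: int = 2) -> set[bytes]:
    """Distinct ODNs seen by a sliding window of `size` bytes."""
    return {dna[i:i + size] for i in range(len(dna) - size + 1)}


def build_virus_odn_library(virus_programs, legal_programs, min_gap):
    """Keep ODNs that are clearly more common in virus samples (assumed criterion)."""
    virus_df = Counter(o for p in virus_programs for o in odns(p))
    legal_df = Counter(o for p in legal_programs for o in odns(p))
    return {
        o for o, n in virus_df.items()
        if n / len(virus_programs) - legal_df[o] / len(legal_programs) >= min_gap
    }


def negative_select(candidate_genes, legal_virus_like_genes):
    """Delete every candidate gene that also appears in the legal virus-like library."""
    return set(candidate_genes) - set(legal_virus_like_genes)


# Hypothetical training data.
viruses = [b"\x90\xcd\x21\xeb", b"\xcd\x21\xeb\x00"]
legal = [b"\x90\x00\x01\x02", b"\x00\x01\xcd\x21"]
odn_library = build_virus_odn_library(viruses, legal, min_gap=0.6)

detecting = negative_select({b"\xcd\x21\xeb", b"\x90\xcd"}, {b"\x90\xcd"})
```

Here the negative selection step uses exact equality between genes, which follows the "appear in both libraries" wording; a fuzzy matching rule could be substituted.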
2) Candidate virus gene library:
The basic storage block in the candidate virus gene library is the virus sample. All the genes of each sample are stored together, so that different genes of one virus are kept in the same storage block and genes of different viruses are kept separately. This kind of storage mode is called signature storage on individual level in this paper. The gene libraries mentioned below apply this storage mode to keep the relevance between the different genes extracted from the same virus. Comparison between programs can then be made on the individual level with integrated information about virus signatures.
The model uses continuous matching to match the virus DNA with the ODNs in the virus ODN library. This means that, from the first matching position, a sliding window moves forward until a mismatch happens. The number of ODNs from the virus ODN library that take part in the matching, from beginning to end, is then recorded. If this number is larger than a preset threshold, the matched fragment is kept as a candidate virus gene.
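Continuous matching can be sketched as below. The sketch assumes two-byte ODNs, a window that advances one byte at a time, and that a run of matched ODNs longer than the threshold is recorded as a gene spanning the whole run. These details, like the function name and sample bytes, are assumptions rather than the model's exact definition.

```python
def extract_genes(dna: bytes, odn_library: set, threshold: int, size: int = 2) -> list:
    """Collect maximal runs of library ODNs that are longer than `threshold`."""
    genes = []
    last = len(dna) - size          # last valid window start
    i = 0
    while i <= last:
        count = 0
        # Move the window forward while each successive ODN is in the library.
        while i + count <= last and dna[i + count:i + count + size] in odn_library:
            count += 1
        if count > threshold:
            # The run of `count` overlapping ODNs covers count + size - 1 bytes.
            genes.append(dna[i:i + count + size - 1])
        i += max(count, 1)
    return genes


# Hypothetical ODN library and program fragment.
library = {b"\xcd\x21", b"\x21\xeb", b"\xeb\x05"}
program = b"\x00\xcd\x21\xeb\x05\x00"
genes = extract_genes(program, library, threshold=2)
```

Raising the threshold makes genes longer and rarer; with `threshold=3` the same fragment yields no gene, because only three consecutive ODNs match.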
3) Detecting virus gene library:
Using the same method as for generating the candidate virus gene library, the model also generates a legal virus-like gene library by matching the legal programs with the ODNs in the virus ODN library.
Taking the legal virus-like genes as self and the candidate virus genes as nonself, the NSA is applied to generate the detecting virus gene library. The matching used here is fuzzy, allowing some faults in matching.
C. Self-Nonself Classification Module
Repeating the method that generates the candidate virus gene library, the ODNs in the detecting virus gene library are used to generate the suspicious virus-like gene library. Then we match the virus-like genes in the suspicious program with the genes in the detecting virus gene library.
Matching degree between two genes:
This module again uses T-successive consistency matching to match two genes.
Suspicious program detection:
The suspicious program is matched with each virus sample in the detecting virus gene library, and a similarity value is calculated for each sample. All the values for this program are added together to give the similarity value between the program and the detecting virus gene library.
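The scoring step can be sketched as follows, under two assumptions that the text does not confirm: T-successive consistency matching is taken to mean that two genes share a common run of at least T identical consecutive bytes, and the similarity value for one virus sample is taken to be the number of that sample's genes matched by some gene of the suspicious program. All names and data are hypothetical.

```python
def t_successive_match(a: bytes, b: bytes, t: int) -> bool:
    """Assumed rule: genes match if they share a common run of at least t bytes."""
    runs = {a[i:i + t] for i in range(len(a) - t + 1)}
    return any(b[j:j + t] in runs for j in range(len(b) - t + 1))


def program_similarity(suspicious_genes, detecting_library, t):
    """Add the per-sample similarity values over all virus samples."""
    total = 0
    for sample_genes in detecting_library.values():
        # Per-sample value (assumed): number of the sample's genes that are matched.
        total += sum(
            any(t_successive_match(g, s, t) for s in suspicious_genes)
            for g in sample_genes
        )
    return total


# Genes are stored per virus sample, as in signature storage on individual level.
detecting_library = {
    "virus_a": [b"\xcd\x21\xeb\x05", b"\x90\x90\x90"],
    "virus_b": [b"\xb4\x4c\xcd\x21"],
}
suspicious_genes = [b"\x00\xcd\x21\xeb"]
score = program_similarity(suspicious_genes, detecting_library, t=2)
```

A final decision would compare the score with some threshold; the text does not give one, so none is included here.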
Summary
Everything I have written above I studied from books or research papers. Now I give my own idea based on what I have learnt; what follows is purely my own idea.
The Negative Selection Algorithm (NSA) is an algorithm for change detection based on the principles of self-nonself discrimination (by T cell receptors) in the immune system. The receptors can detect antigens. The universe of antigens is partitioned into self (S) and nonself (NS).
Illustration of NS Algorithm: Match or Don’t Match
Let r = 2
 | Match | Don’t Match
Detector | 1011 | 1011
Self Strings (S) | 1000 | 1101
There exists an efficient BNS algorithm that runs in time linear in the size of self, together with an efficient algorithm to count the number of binary numbers.
The algorithm has two steps:
Generate a set R of detectors, each of which fails to match any string in S.
Monitor new observations (of S) for changes by continually testing the detectors against representatives of S. If any detector ever matches, a change (or deviation) must have occurred in system behavior.
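Partial matching rule:
Two strings of length l = 20 match if they agree in at least r = 5 contiguous positions. For example:
01010011001100010101
01110011011100011001
These two strings agree in a run of six contiguous positions (for example positions 4 to 9, counting from 1), so they match for r = 5.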
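The partial matching rule is easy to check by machine. The sketch below, with illustrative function names, computes the longest run of positions in which two equal-length strings agree and compares it with r.

```python
def longest_contiguous_agreement(a: str, b: str) -> int:
    """Length of the longest run of adjacent positions where a and b are equal."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """Partial matching: the strings match if they agree in r contiguous positions."""
    return longest_contiguous_agreement(a, b) >= r


a = "01010011001100010101"
b = "01110011011100011001"
print(longest_contiguous_agreement(a, b))   # longest agreeing run for the l = 20 pair
print(r_contiguous_match(a, b, 5))          # match decision for r = 5
print(r_contiguous_match("1011", "1000", 2), r_contiguous_match("1011", "1101", 2))
```

A smaller r makes detectors more general (each one matches more strings), while a larger r makes them more specific.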
Anomaly detection:
110011 | 10110 | 11000 | ... | 110001
Figure: Symbolic representation of binary or alphabet strings; sliding window for pattern recognition.
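CODE to detect the viral code and the legal code (Python):
# Ni = legal code (self), Nj = pseudo code (sample under test), No = viral code.
# The training set is comprised of self patterns; initially Ni != Nj and Ni != No.
# r is the sliding-window length (an assumption; the original idea does not fix it).

def windows(code, r):
    # All substrings of length r seen by a sliding window over code.
    return {code[k:k + r] for k in range(len(code) - r + 1)}

def match(a, b, r):
    # Sliding window principle: a and b match if they share at least one window.
    return bool(windows(a, r) & windows(b, r))

def classify(Ni, Nj, No, r):
    # If Ni matches Nj and Ni mismatches No, then Nj is legal code and No is viral code.
    if match(Ni, Nj, r) and not match(Ni, No, r):
        return "legal code", "viral code"    # labels for Nj and No
    return None                              # no decision can be made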
CONCLUSION
Here I have learnt that negative selection algorithms are characterized by particular representation schemes, matching rules and detector generation processes. Many models exist to recognize viruses and malicious code.
This is just my summary. My original work is not yet complete, and what I have written here is only a summary; the complete work may take more time. Here I have given just the fundamental idea of AIS for computer security.
The algorithm above is self-written (without any help or copying), so there may be mistakes in it, as I have not completed my work fully. I hope that in my future work I can give a better algorithm.
This is my minor project for the 7th semester. I hope I will get the chance to research it further in future, and I pray to God for that. If I continue it in my 8th semester, I hope I can show you something new.
When my work is complete I can show you the whole of it. Till now it is about 55 pages; I don’t know how much more time it will take or how many pages it will be. Hope for the best.
You may get my whole work after one month, fully corrected and purely my own work. This project is done by me alone. I want to express my special gratitude to my professors who have helped me.
REFERENCES
[1] P. S. Deng, J. Wang, W. Shieh et al. “Intelligent automatic malicious code signatures extraction”, IEEE 37th Annual 2003 International Carnahan Conference on Security Technology, pp. 600-603.
[2] K. P. Anchor, P. D. Williams, G. H. Gunsch et al. “The Computer Defense Immune System: Current and Future Research in Intrusion Detection”, Evolutionary Computation, 2002, pp. 1027-1032.
[3] J. O. Kephart. “A Biologically Inspired Immune System for Computers”, in Artificial Life IV, Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems, 1994, pp. 130-139.
[4] S. Forrest, A. S. Perelson, L. Allen et al. “Self-Nonself Discrimination in a Computer”, Security and Privacy, Oakland CA, pp. 202-212, 1994.
[5] P. D’haeseleer, S. Forrest, P. Helman. “An immunological approach to change detection: algorithms, analysis, and implications”, Proceedings of IEEE Symposium on Research in Security and Privacy, Oakland, CA, pp. 110-119, May 1996.
[6] H. Lee, W. Kim, M. Hong. “Artificial Immune System against Viral Attack”, ICCS 2004, Lecture Notes in Computer Science 3037, pp. 499-506, 2004.
[7] K. S. Edge, G. B. Lamont, R. A. Raines. “A retrovirus inspired algorithm for virus detection & optimization”, 8th Annual Genetic and Evolutionary Computation Conference, Seattle WA, 2006, pp. 103-110.
[8] T. Li. Computer Immunology, Beijing: Publishing house of electronics industry, pp. 187-191, 2004.
[9] D. Dasgupta, N. Attoh-Okine. “Immunity-Based Systems: A survey”, 1997 IEEE International Conference on Systems, Man, and Cybernetics, Computational Cybernetics and Simulation, 1997, pp. 369-374.
[10] P. K. Harmer, P. D. Williams, G. H. Gunsch et al. “An Artificial Immune System Architecture for Computer Security Applications”, IEEE Transactions on Evolutionary Computation, vol. 6(3), pp. 252-280, 2002.
[11] M. D. Preda, M. Christodorescu, S. Jha et al. “A Semantics-Based Approach to Malware Detection”, 34th Annual Symposium on Principles of Programming Languages, vol. 42(1), pp. 377-388, 2007.
[12] O. Henchiri, N. Japkowicz, J. Nathalie. “A Feature Selection and Evaluation Scheme for Computer Virus Detection”, Sixth International Conference on Data Mining, Hong Kong, China, 2006, pp. 891-895.
[13] Beer, R.D., Chiel, H.J. and Sterling, S., A Biological Perspective on Autonomous Agent Design, In Robotics and Autonomous Systems, Vol. 6, (1990), 169-186.
[14] Dasgupta, D., Artificial Immune Systems and Their Applications, Heidelberg, Germany: Springer-Verlag, 1999.
[15] Dasgupta, D., An artificial immune system as a multi-agent decision support system, Proc. IEEE Int. Conf. Systems, Man and Cybernetics, (Oct. 1998), pp. 3816-3820.
[16] David Kotz and Robert S. Gray, Mobile Agents and the Future of the Internet, ACM Operating Systems Review, (Aug. 1999), 7-13.
[17] Desel, J., and Reisig, W., Place/Transition Petri Nets. In Lecture on Petri nets I: Basic Models, vol 1491 of Lecture Notes in Computer Science, Springer-Verlag, 1998.
[18] Forrest S., Perelson A.S., Allen L., and Cherukuri, R., Self-Nonself Discrimination in a Computer, Proceedings of the IEEE Symposium on Research in Security and Privacy (Los Alamitos, CA: IEEE Computer Society Press), 1994.
[19] Goel, S. and Bush S.F., Biological Models of Security for Virus Propagation in Computer Networks, login:, vol. 29, no. 6, (Dec. 2004), 49-56.
[20] Kaariboga Mobile Agents (Sep. 2003). [Online]. Available: http://www.projectory.de/kaariboga/index
[21] Kephart, J.O., Biologically Inspired Defenses against Computer Viruses, Proceedings of IJCAI ’95, (1995) 985-996.
[22] Paul K. Harmer et al, An Artificial Immune System Architecture for Computer Security Applications, IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, (Jun. 2002), 252-280.
[23] Virus Information and Statistics, [Online]. Available: http://www.avira.com/en/threats/
Proceedings of the World Congress on
Engineering 2008 Vol I
WCE 2008, July 2 - 4, 2008, London, U.K.
ISBN: