This paper presents the formulation of the possibilistic Renyi entropy function from the Renyi entropy function using the framework of the Hanman-Anirban entropy function. The new entropy function is used to derive information set features from keystroke dynamics for the authentication of users. A new composite fuzzy classifier based on the Mamta-Hanman entropy function is also proposed and applied to the information set based features. A comparison of the results of the proposed approach with those of the Support Vector Machine and Random Forest classifiers shows that the new classifier outperforms the other two.
With the ever increasing use of computers, the internet and online transactions, access control and web security have become essential. Various methods are used for secure access and online user authentication, but most of them suffer from drawbacks. Password/PIN based access control is easy to forge using a brute-force attack. Similarly, tokens like smart cards used for authentication can be lost or stolen. So, biometric systems employing both physiological and behavioral modalities have recently gained popularity. Physiological biometrics comprises physical traits such as face, fingerprint, iris, palm-print, speech and hand geometry. Behavioral biometrics is based on human activity such as gait, voice, signature and keystroke dynamics. Keystroke dynamics depicts the natural typing rhythm captured through the keyboard available on most computing systems, including handheld devices like mobiles/PDAs with touch-based keyboards. Keystroke dynamics is a strong behavioral biometric with many advantages and offers a solution to many problems faced by other access control mechanisms. Some of the advantages are: it cannot be copied, as it is difficult to copy human behavior, and it cannot be stolen, forged or lost. As no special device is required, it is a low-cost biometric solution. Keystroke dynamics has high user acceptance and can be operated in hidden mode. It can also be used for continuous user authentication while the user is working on the system. Moreover, keystroke dynamics based authentication is well suited for online user verification, as the keystroke feature vector of timings is not very large.
The features for keystroke dynamics mainly consist of timing data: the time to move from one key to another, known as flight time, and the time for which a key is pressed, known as dwell time. Different researchers have used different timing features based on these basic keystroke timings. Some have used Down-Down Time, Up-Up Time, Up-Down Time and Down-Up Time, where Down Time is the instant a key is pressed and Up Time is the instant it is released. Similarly, Press-Press Time (PPTime), Press-Release Time (PRTime), Release-Press Time (RPTime) and Release-Release Time (RRTime) are used; Press Time is the same as Down Time and Release Time is the same as Up Time. In addition to timing features, we can also include keystroke pressure, i.e. the pressure applied on the key, as part of the keystroke dynamics features [
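As a concrete illustration of these timings, the sketch below derives them from raw press/release timestamps (the function and variable names are ours, for illustration only; they do not come from any particular dataset or toolkit):

```python
import numpy as np

def keystroke_timing_features(press_times, release_times):
    """Derive the basic keystroke-dynamics timings from raw key events.

    press_times[i]   -- instant key i is pressed  (Down/Press time)
    release_times[i] -- instant key i is released (Up/Release time)
    """
    press = np.asarray(press_times, dtype=float)
    release = np.asarray(release_times, dtype=float)
    hold = release - press         # dwell (Hold, Press-Release) time
    pp = np.diff(press)            # Press-Press (Down-Down) time
    rr = np.diff(release)          # Release-Release (Up-Up) time
    rp = press[1:] - release[:-1]  # Release-Press (Up-Down), the flight time
    pr = release[1:] - press[:-1]  # Press-Release (Down-Up) time
    return hold, pp, rr, rp, pr
```

For example, press instants [0.0, 0.3] s and release instants [0.1, 0.45] s yield dwell times 0.1 s and 0.15 s and a flight (Up-Down) time of 0.2 s.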
Text entry in the form of a fixed string, predetermined at the initial instance of the user's interaction with the authentication system, yields static keystroke dynamics. Text entry can also be dynamic, where the user types free text for continuous authentication. Static entry datasets are publicly available, e.g. in Killourhy and Maxion [
User’s master profile is created based on keystroke dynamics behavior from username using trajectory dissimilarity technique in [
Killourhy and Maxion [
Deng and Zhong [
Teh et al. [
A hybrid model involving the fusion of Gaussian probability density function (GPDF) and SVM based scores is developed in [
Pisani et al. [
Ivannikova et al. [
Sliding windows of different sizes are used in [
The present work is concerned with the generation of information set features using the possibilistic Renyi entropy function from the keystroke dynamics comprising dwell time and flight time. Our previous work [
Though many classifiers based on statistical methods, neural networks and pattern recognition techniques are in vogue for authenticating a user by keystroke dynamics, we propose a new fuzzy classifier.
The organization of the paper is as follows: Section 2 gives the derivation for the possibilistic Renyi Entropy function. It also formulates the Information Set features and higher form of these features based on this entropy function. Section 3 develops an algorithm for the Composite Fuzzy classifier based on Composite convex Entropy function. Section 4 describes the databases used in the present work and Section 5 discusses the results of implementation. Section 6 gives the conclusions.
To represent the probabilistic uncertainty, we have several entropy functions such as Shannon, Pal and Pal [
$H_R = \frac{1}{1-\alpha}\log\left(\sum_{i=1}^{n} p_i^{\alpha}\right)$ (1)
To represent the possibilistic uncertainty, $p_i$ is replaced by $T_i$ in (1). This leads to
$H_R = \frac{1}{1-\alpha}\log\left(\sum_{i=1}^{n} T_i^{\alpha}\right)$ (2)
The parameter $\alpha$ in (2) is a constant, but we take it as a variable in the range (0, 1) and derive in the next section the adaptive Renyi entropy function by relating it to the Hanman-Anirban entropy function [
$H_i = T_i\, e^{-(a T_i^3 + b T_i^2 + c T_i + d)}$ (3)
where $T_i$ is the information source value and $a$, $b$, $c$, and $d$ are the parameters in the exponential gain function. These parameters are selected to be statistical parameters such that the gain function becomes the Gaussian function. For this the choice of parameters is: $a = 0$, $b = \frac{1}{2\sigma^2}$, $c = -\frac{2\bar{T}}{2\sigma^2}$ and $d = \frac{\bar{T}^2}{2\sigma^2}$, where $\bar{T}$ is the mean value; with these, the exponent reduces to $-\frac{(T_i-\bar{T})^2}{2\sigma^2}$, i.e. the Gaussian membership $\mu_i$. Then (3) becomes $H_i = T_i \mu_i$.
To bring (2) into the possibilistic domain, let us consider only the ith term in the summation and take $\alpha$ to be a variable. This leads to
$H_i = \frac{1}{1-\alpha_i}\log\left(T_i^{\alpha_i}\right) = \frac{\alpha_i}{1-\alpha_i}\log(T_i)$ (4)
Assuming the membership function value to be $\mu_i = \frac{1}{1-\alpha_i}$, we have $\frac{\alpha_i}{1-\alpha_i} = \mu_i - 1 = -\bar{\mu}_i$, where $\bar{\mu}_i = 1 - \mu_i$ is the complement membership. The membership function $\mu$ is taken to be the Gaussian function with its statistical parameters, the mean $\bar{T}$ and the variance $\sigma^2$, computed from the keystroke measurements $\{T\}$ as explained above.
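A minimal sketch of this membership computation, estimating the mean and variance from the measurement vector itself (the function name is ours, for illustration):

```python
import numpy as np

def gaussian_membership(T):
    """Gaussian membership values for measurements T, with the mean and
    variance estimated from T itself, as in the Hanman-Anirban setting."""
    T = np.asarray(T, dtype=float)
    mean, var = T.mean(), T.var()
    return np.exp(-((T - mean) ** 2) / (2.0 * var))
```

A measurement at the sample mean gets membership 1, and values farther from the mean decay toward 0.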
Now substituting this in (4), the ith component of the adaptive Renyi entropy function is:
$H_i = -\bar{\mu}_i \log T_i$ (5)
The r.h.s. of this equation can be represented in modus ponens form as $\bar{\mu}_i \rightarrow \log T_i$. This in turn allows us to write $-\bar{\mu}_i \cup \log T_i$, which means we can take $H_i = \max\{-\bar{\mu}_i, \log T_i\}$, though we have taken it as the product.
Replacing $-\log T_i$ by $e^{(1-T_i)}$ in (5), we get one term of the adaptive Renyi entropy function as:
$H_i = \bar{\mu}_i\, e^{(1-T_i)}$ (6)
This is different from the entropy function term $T_i^{\alpha}\mu_i^{\beta}$ derived in our previous work [
The above is in the form given by
$H_i = f(T_i)\cdot e^{(1-T_i)}$ (7)
This is an information value of the Hanman-Anirban entropy function for $a = b = 0$, $c = 1$, $d = -1$, with $T_i$ replaced by a function of $T_i$, namely $f(T_i) = \bar{\mu}_i$. Thus the information source value is the complement membership function $\bar{\mu}_i$ and the gain function is exponential. We have shown that one term of the Renyi function takes one specific form of the Hanman-Anirban entropy function. From (6) it is easy to form an information set $\{\bar{\mu}_i e^{(1-T_i)}\}$ by varying the index $i$, and the resulting possibilistic Renyi entropy function is:
$H_{Rp} = \sum_{i=1}^{n} \bar{\mu}_i\, e^{(1-T_i)}$ (8)
Let the mean membership be $\mu = \frac{1}{n}\sum_{i=1}^{n}\mu_i$; substituting this for $\alpha$ in (2), we have
$H_R = \frac{1}{1-\mu}\log\left(\sum_{i=1}^{n} T_i^{\mu}\right)$ (9)
The difference $\Delta H_R = H_R - H_{Rp}$ is the error incurred in approximating the Renyi function in the possibilistic domain.
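The per-component information values of Eq. (6) and their sum, the possibilistic Renyi entropy of Eq. (8), can be sketched as follows (a hypothetical helper, not the authors' code):

```python
import numpy as np

def possibilistic_renyi_features(T):
    """Information-set values  mu_bar_i * exp(1 - T_i)  of Eq. (6)."""
    T = np.asarray(T, dtype=float)
    mean, var = T.mean(), T.var()
    mu = np.exp(-((T - mean) ** 2) / (2.0 * var))  # Gaussian membership
    return (1.0 - mu) * np.exp(1.0 - T)            # complement membership times gain

# Summing the information set gives the possibilistic Renyi entropy of Eq. (8):
H_Rp = possibilistic_renyi_features([0.12, 0.25, 0.18]).sum()
```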
1) Complement Renyi Function: By replacing $\bar{\mu}_i$ with $\mu_i$ in Equation (6), we get the Complement Renyi function:
$H_i = \mu_i\, e^{(1-T_i)}$ (10)
The above can be written as $H_i = \mu_i(1 - 1 + T_i) = \mu_i T_i$. With the substitution of the proper values for the parameters in (3), we get what we call the basic information value $\mu_i T_i$. This is proved in [
2) Sigmoid Renyi Function: Considering Equation (6) as a unit of information, we now apply the sigmoid function on it to get:
$S_i = \frac{1}{1+e^{-\bar{\mu}_i e^{(1-T_i)}}}$ (11)
3) Complement Sigmoid Function: Replacing $\bar{\mu}_i$ with $\mu_i$ in (11), we get:
$S_i = \frac{1}{1+e^{-\mu_i e^{(1-T_i)}}}$ (12)
4) Renyi Entropy Energy: This follows from (6) on multiplying it by $\bar{\mu}_i$:
$H_i = \bar{\mu}_i^2\, e^{(1-T_i)}$ (13)
5) Complement Renyi Energy: Replacing $\bar{\mu}_i$ by its complement $\mu_i$ in (13), we obtain:
$H_i = \mu_i^2\, e^{(1-T_i)}$ (14)
6) Renyi Transform: The Renyi entropy function is not amenable to conversion into transforms the way the Hanman transform is. But when we put the Renyi entropy function into the form of the Hanman-Anirban entropy function, it offers us the facility to create transforms. Consider the Hanman-Anirban entropy function in the following form:
$H_i = f(T_i)\, e^{-(a T_i^3 + b T_i^2 + c T_i + d)}$ (15)
where $f(T_i) = T_i$ in the original Hanman-Anirban entropy function; here we take $f(T_i) = \bar{\mu}_i$ to convert it into the Renyi entropy function form. Further taking $a = 0$, $b = 0$, $c = \mu_i$ and $d = 0$, we get the Renyi transform given by:
$H_i = \bar{\mu}_i\, e^{-\mu_i T_i}$ (16)
To introduce non-linearity in the values of $\bar{\mu}_i$, we can raise it to a power $\alpha$:
$H_i = \bar{\mu}_i^{\alpha}\, e^{-\mu_i T_i}$ (17)
7) Complement Renyi Transform: Replacing $\bar{\mu}_i$ by its complement $\mu_i$ in (16), we get the Complement Renyi Transform:
$H_i = \mu_i\, e^{-\mu_i T_i}$ (18)
8) Modified Sigmoid Renyi Function: Applying the sigmoid function to Equation (5), we get the Modified Sigmoid Renyi Function:
$S_i = \frac{1}{1+e^{-\bar{\mu}_i \log T_i}}$ (19)
9) Modified Complement Sigmoid Renyi Function: Replacing $\bar{\mu}_i$ with $\mu_i$ in Equation (19), we get the modified complement sigmoid Renyi function:
$S_i = \frac{1}{1+e^{-\mu_i \log T_i}}$ (20)
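The feature family derived above (Eq. (6) and Eqs. (10)-(20), excluding the power-α variant of Eq. (17)) can be collected in one sketch. The dictionary keys below are our shorthand, not the paper's nomenclature; the timing values are assumed positive, as keystroke timings are, so the logarithms in (19)/(20) are defined:

```python
import numpy as np

def renyi_feature_family(T):
    """All information-set feature types, evaluated element-wise from the
    timing vector T and its Gaussian membership mu."""
    T = np.asarray(T, dtype=float)
    mean, var = T.mean(), T.var()
    mu = np.exp(-((T - mean) ** 2) / (2.0 * var))
    cmu = 1.0 - mu                             # complement membership
    sig = lambda x: 1.0 / (1.0 + np.exp(-x))   # sigmoid
    return {
        "adaptive_renyi":         cmu * np.exp(1.0 - T),       # Eq. (6)
        "complement_renyi":       mu * np.exp(1.0 - T),        # Eq. (10)
        "sigmoid_renyi":          sig(cmu * np.exp(1.0 - T)),  # Eq. (11)
        "complement_sigmoid":     sig(mu * np.exp(1.0 - T)),   # Eq. (12)
        "renyi_energy":           cmu**2 * np.exp(1.0 - T),    # Eq. (13)
        "complement_energy":      mu**2 * np.exp(1.0 - T),     # Eq. (14)
        "renyi_transform":        cmu * np.exp(-mu * T),       # Eq. (16)
        "complement_transform":   mu * np.exp(-mu * T),        # Eq. (18)
        "mod_sigmoid_renyi":      sig(cmu * np.log(T)),        # Eq. (19)
        "mod_complement_sigmoid": sig(mu * np.log(T)),         # Eq. (20)
    }
```

The sigmoid-based entries lie in (0, 1) by construction.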
In our previous work [
Algorithm [
Step 1: Calculate the mean $T_{avg}^{(1)}$ and variance $\sigma^{(1)}$ of all the training samples.
Step 2: Calculate the mean $T_{avg}^{(2)}$ and variance $\sigma^{(2)}$ of all the keystroke features in a single training sample.
Step 3: Compute $\mu^{(1)}$ using $T_{avg}^{(1)}$ and $\sigma^{(1)}$, and similarly compute $\mu^{(2)}$ using $T_{avg}^{(2)}$ and $\sigma^{(2)}$. Next compute the two components $I_1 = \{\mu_{ij}^{(1)} T_{ij}\}$ and $I_2 = \{\mu_{ij}^{(2)} T_{ij}\}$ using $\mu^{(1)}$ and $\mu^{(2)}$.
Step 4: Concatenate $I_1$ and $I_2$ to form $I$. Then train any classifier using the concatenated $I$.
Step 5: Compute $I_{t1}$ using $T_{avg}^{(1)}$ and $\sigma^{(1)}$ from Step 1 for each test sample.
Step 6: Compute the mean $T_{avg}^{(2)}$ and variance $\sigma^{(2)}$ of all the features in the test sample. Also compute $I_{t2}$ using $T_{avg}^{(2)}$ and $\sigma^{(2)}$.
Step 7: Concatenate $I_{t1}$ and $I_{t2}$ to obtain $I_t$ and feed this feature vector to any classifier.
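Steps 1-4 can be sketched for one user's training matrix (rows are samples, columns are keystroke features); Steps 5-7 repeat the same computation on a test sample, reusing the Step-1 statistics. The function and variable names are ours, for illustration:

```python
import numpy as np

def gaussian_mu(T, mean, var):
    # Gaussian membership with given statistical parameters
    return np.exp(-((T - mean) ** 2) / (2.0 * var))

def two_component_features(samples):
    """samples: (s, n) matrix of keystroke feature vectors for one user."""
    X = np.asarray(samples, dtype=float)
    m1, v1 = X.mean(), X.var()              # Step 1: statistics over all samples
    mu1 = gaussian_mu(X, m1, v1)            # Step 3: first membership component
    m2 = X.mean(axis=1, keepdims=True)      # Step 2: per-sample statistics
    v2 = X.var(axis=1, keepdims=True)
    mu2 = gaussian_mu(X, m2, v2)
    I1, I2 = mu1 * X, mu2 * X               # basic information values mu * T
    return np.hstack([I1, I2])              # Step 4: concatenated feature matrix
```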
Before proceeding to the design of a classifier, we need the error vector between the training feature vector of the lth user corresponding to the mth training sample, denoted by $x_{mj}^l$, and the test feature vector $t_j$. Let the size of the feature vector be $n$ and the number of training feature vectors per user be $s$. The error vector is computed from:
$e_{mj}^l = |x_{mj}^l - t_j|;\quad m = 1, 2, \cdots, s,\ j = 1, 2, \cdots, n$ (21)
As we need a membership in the formulation of a fuzzy classifier, we select an exponential membership function as:
$\mu_{mj}^l = e^{-(|x_{mj}^l - t_j|)};\quad m = 1, 2, \cdots, s,\ j = 1, 2, \cdots, n$ (22)
In view of (21), Equation (22) is rewritten as
$\mu_{mj}^l = e^{-e_{mj}^l}$ (23)
We now apply the Frank t-norm ($t_F$) on a pair of error vectors $e_{mj}^l$ and $e_{hj}^l$ to yield the normed error vector denoted by $E_{mh}^l(j)$ as follows:
$E_{mh}^l(j) = t_F(e_{mj}^l, e_{hj}^l);\quad m \neq h,\ j = 1, 2, \cdots, n$ (24)
In the above, t F is given by
$t_F = \log_q\left[1 + \frac{(q^{e_{mj}^l} - 1)(q^{e_{hj}^l} - 1)}{q - 1}\right];\quad j = 1, 2, \cdots, n$ (25)
Similarly, we compute the t-norm of a pair of membership functions $\mu_{mj}^l$ and $\mu_{hj}^l$, called the normed membership function, using:
$M_{mh}^l(j) = t_F(\mu_{mj}^l, \mu_{hj}^l);\quad m \neq h,\ j = 1, 2, \cdots, n$ (26)
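A sketch of the Frank t-norm of Eq. (25); $q > 0$, $q \neq 1$ is the Frank parameter, and $q = 2$ below is an arbitrary illustrative default, not a value fixed by the paper:

```python
import numpy as np

def frank_tnorm(a, b, q=2.0):
    """Frank t-norm: log_q(1 + (q^a - 1)(q^b - 1)/(q - 1))."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.log1p((q**a - 1.0) * (q**b - 1.0) / (q - 1.0)) / np.log(q)
```

It satisfies the t-norm boundary conditions, e.g. $t_F(a, 1) = a$ and $t_F(a, 0) = 0$, and never exceeds the smaller argument.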
As proved in [
Derivation of the Composite Entropy Function: For this derivation, we take recourse to the Mamta-Hanman entropy function in the form:
$H = \sum_{j=1}^{n} T_j^{\alpha}\, e^{-(c T_j^{\gamma} + d)^{\beta}}$ (27)
By substituting $c = -1$, $d = 0$ and $\beta = 1$, we obtain
$H = \sum_{j=1}^{n} T_j^{\alpha}\, e^{T_j^{\gamma}}$ (28)
with $T_j = E_{mh}^l(j)\, M_{mh}^l(j)$. To develop the composite entropy function, we apply the logarithmic function on (28), leading to
$H = \log \sum_{j=1}^{n} T_j^{\alpha}\, e^{T_j^{\gamma}}$ (29)
The composite function is thus the result of applying the logarithmic function on the Mamta-Hanman entropy function: the available information, the Mamta-Hanman entropy value, is modified by the logarithm. We make use of this composite function in the derivation of the fuzzy classifier, for which an algorithm is outlined here.
Algorithm for the Composite Fuzzy Classifier
1) Find the error vector between the training feature vectors and the test feature vector for the lth user as:
$e_{mj}^l = |x_{mj}^l - t_j|;\quad m = 1, 2, \cdots, s,\ j = 1, 2, \cdots, n$
2) Compute the membership function vectors $(\mu_{m1}^l, \mu_{m2}^l, \cdots, \mu_{mn}^l)$ $(\forall m = 1, 2, \cdots, s)$ for the lth user as follows:
$\mu_{mj}^l = e^{-(|x_{mj}^l - t_j|)};\quad m = 1, 2, \cdots, s,\ j = 1, 2, \cdots, n$
3) Compute the normed error vector $E_{mh}^l$ $(\forall m, h = 1, 2, \cdots, s,\ m \neq h)$ for the lth user from:
$E_{mh}^l(j) = t_F(e_{mj}^l, e_{hj}^l);\quad m \neq h,\ j = 1, 2, \cdots, n$
4) Compute the t-norm of a pair of membership functions, $M_{mh}^l$ $(\forall m, h = 1, 2, \cdots, s,\ m \neq h)$, for the lth user as follows:
$M_{mh}^l(j) = t_F(\mu_{mj}^l, \mu_{hj}^l);\quad m \neq h,\ j = 1, 2, \cdots, n$
5) Compute $H_{mh}^l$ using the composite entropy function:
$H_{mh}^l = \log\left(\sum_{j=1}^{n} \left(E_{mh}^l(j)\, M_{mh}^l(j)\right)^{\alpha} \cdot e^{\left(E_{mh}^l(j)\, M_{mh}^l(j)\right)^{\gamma}}\right)$
6) Repeat Steps 1-5 for all users $l = 1, 2, \cdots, C$, and if $k = \arg\min_l \{H^l\}$, then the test sample belongs to the kth user.
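The full classification loop can be sketched as below, with $\alpha = \gamma = 1$ and Frank parameter $q = 2$ as illustrative choices (the paper does not fix them here), and the per-user entropies summed over all sample pairs:

```python
import numpy as np
from itertools import combinations

def frank_tnorm(a, b, q=2.0):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.log1p((q**a - 1.0) * (q**b - 1.0) / (q - 1.0)) / np.log(q)

def composite_fuzzy_classify(train, test, alpha=1.0, gamma=1.0):
    """train: dict user -> (s, n) training matrix; test: length-n vector.
    Returns the user minimizing the composite entropy of Eq. (29)."""
    scores = {}
    for user, X in train.items():
        E = np.abs(np.asarray(X, dtype=float) - test)  # error vectors, Eq. (21)
        M = np.exp(-E)                                 # memberships, Eq. (23)
        H = 0.0
        for m, h in combinations(range(len(X)), 2):    # all pairs m != h
            Emh = frank_tnorm(E[m], E[h])              # normed errors, Eq. (24)
            Mmh = frank_tnorm(M[m], M[h])              # normed memberships, Eq. (26)
            Tj = Emh * Mmh                             # information source values
            H += np.log(np.sum(Tj**alpha * np.exp(Tj**gamma)))  # Eq. (29)
        scores[user] = H
    winner = min(scores, key=scores.get)               # minimum-entropy decision
    return winner, scores
```

Small errors drive the summand toward zero, so the genuine user attains the smallest composite entropy.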
The above Renyi entropy features are applied to the publicly available dataset from CMU.
For the evaluation of the keystroke dynamics based authentication system, we have used the following publicly available dataset:
CMU Keystroke Dynamics Benchmark Dataset [
Data is collected from 51 users in 8 sessions, with 50 repetitions of the same password recorded in each session, so each user has 400 samples. The CMU benchmark dataset has the keystroke features DD (Down-Down) time, UD (Up-Down) time and H (Hold) time. Each user typed a 10-character password (“.tie5Roanl”). For the evaluation of the Renyi entropy based features, we use the Hold and Up-Down times. Therefore, we have 21 features: 11 Hold Time values for the 10 password characters and the Enter key, and 10 Up-Down Time values for the latencies between successive key-release and key-press events.
Half of the samples for each user (i.e. 200 samples) are used as the training data and the remaining half for positive testing. Each user is considered both as a genuine user and as an imposter, thus facilitating 51 × 50 experiments.
For the classification, three classifiers are employed. The first is the Random Forest classifier, in which an ensemble of decision trees is generated from the training data. The second is a two-class SVM classifier with a linear kernel. The third is the proposed Composite Fuzzy Classifier inspired by the Hanman Classifier [
To evaluate the performance of the derived features, the error rates, viz. FAR (False Acceptance Rate), FRR (False Rejection Rate) and EER (Equal Error Rate), along with the authentication accuracy, are calculated for each of the 51 × 50 experiments and reported.
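These rates can be obtained by sweeping a decision threshold over the genuine and imposter score distributions; the sketch below assumes lower scores mean "more genuine", consistent with the minimum-entropy decision rule (names are illustrative):

```python
import numpy as np

def far_frr_eer(genuine_scores, imposter_scores):
    """Threshold sweep over dissimilarity scores (lower = more genuine)."""
    g = np.asarray(genuine_scores, dtype=float)
    i = np.asarray(imposter_scores, dtype=float)
    thresholds = np.sort(np.concatenate([g, i]))
    far = np.array([(i <= t).mean() for t in thresholds])  # imposters accepted
    frr = np.array([(g > t).mean() for t in thresholds])   # genuine rejected
    k = int(np.argmin(np.abs(far - frr)))                  # closest FAR/FRR crossing
    eer = (far[k] + frr[k]) / 2.0
    return far, frr, eer
```

The EER is read off where the FAR and FRR curves cross; perfectly separated score distributions give an EER of 0.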
Some of the features of
The information set features derived from Renyi Entropy are applied on the Composite Fuzzy Classifier and the results are shown in
Feature | FAR | FRR | EER | Accuracy |
---|---|---|---|---|
Adaptive Renyi Function | 0.0114 | 0.0254 | 0.0153 | 0.9824 |
Complement Renyi Function | 0.0117 | 0.0258 | 0.0155 | 0.9820 |
Sigmoid Renyi Function | 0.0112 | 0.0258 | 0.0152 | 0.9823 |
Complement Sigmoid Renyi Function | 0.0118 | 0.0255 | 0.0155 | 0.9821 |
Energy Renyi Feature | 0.0117 | 0.0247 | 0.0153 | 0.9825 |
Complement Energy Renyi Feature | 0.0112 | 0.0271 | 0.0153 | 0.9818 |
Renyi Transform | 0.0112 | 0.0261 | 0.0153 | 0.9822 |
Complement Renyi Transform | 0.0116 | 0.0257 | 0.0153 | 0.9821 |
Modified Sigmoid Renyi Feature | 0.0114 | 0.0256 | 0.0153 | 0.9823 |
Modified Complement Sigmoid Renyi Feature | 0.0117 | 0.0277 | 0.0163 | 0.9812 |
Feature | FAR | FRR | EER | Accuracy |
---|---|---|---|---|
Adaptive Renyi Function | 0.0187 | 0.0428 | 0.0283 | 0.9706 |
Complement Renyi Function | 0.0191 | 0.0441 | 0.0290 | 0.9698 |
Sigmoid Renyi Function | 0.0197 | 0.0412 | 0.0279 | 0.9708 |
Complement Sigmoid Renyi Function | 0.0202 | 0.0436 | 0.0293 | 0.9694 |
Energy Renyi Feature | 0.0218 | 0.0460 | 0.0312 | 0.9675 |
Complement Energy Renyi Feature | 0.0201 | 0.0438 | 0.0291 | 0.9694 |
Renyi Transform | 0.0185 | 0.0424 | 0.0285 | 0.9709 |
Complement Renyi Transform | 0.0194 | 0.0430 | 0.0291 | 0.9701 |
Modified Sigmoid Renyi Feature | 0.0183 | 0.0418 | 0.0272 | 0.9712 |
Modified Complement Sigmoid Renyi Feature | 0.0218 | 0.0458 | 0.0310 | 0.9675 |
Feature | FAR | FRR | EER | Accuracy |
---|---|---|---|---|
Adaptive Renyi Function | 0.0125 | 0.0190 | 0.0144 | 0.9846 |
Complement Renyi Function | 0.0144 | 0.0186 | 0.0153 | 0.9837 |
Sigmoid Renyi Function | 0.0113 | 0.0222 | 0.0148 | 0.9838 |
Complement Sigmoid Renyi Function | 0.0166 | 0.0198 | 0.0167 | 0.9820 |
Energy Renyi Feature | 0.0171 | 0.0268 | 0.0196 | 0.9786 |
Complement Energy Renyi Feature | 0.0119 | 0.0241 | 0.0149 | 0.9827 |
Renyi Transform | 0.0137 | 0.0180 | 0.0146 | 0.9844 |
Complement Renyi Transform | 0.0141 | 0.0189 | 0.0152 | 0.9838 |
Modified Sigmoid Renyi Feature | 0.0106 | 0.0248 | 0.0149 | 0.9831 |
Modified Complement Sigmoid Renyi Feature | 0.0181 | 0.0241 | 0.0199 | 0.9793 |
the best performance with Adaptive Renyi Function for EER of 0.0144 and an accuracy of 0.9846.
Now we compare the performance of the Composite Fuzzy Classifier with SVM and Random Forest in terms of ROC curves. Since the EER is computed by taking the mean of the EERs from the 51 × 50 experiments, the comparison of ROC curves is shown for one experiment, for user 20 and imposter 11 of the CMU dataset.
ROC curves for the above derived information set features for user number 20 with imposter 11 are shown in Figures 1-10.
In almost all the cases presented above, the proposed composite fuzzy classifier clearly outperforms SVM and Random Forest Classifiers in terms of both error rates and ROC curves.
We have presented an approach for the authentication of users based on keystroke dynamics using the information set features derived from the adaptive Renyi entropy function by establishing its connection with the Hanman-Anirban function. This in turn has paved the way for deriving several features on similar lines to the already existing information set features based on the Hanman-Anirban entropy function. The feature vectors of a particular feature type corresponding to the samples of each user are arranged in matrix form. Using columns to represent the spatial information component and rows to represent the temporal information, Two-Component Information Set (TCIS) features are derived. Thus TCIS features for all feature types are obtained.
For the development of the composite entropy function, the log function is applied on the Mamta-Hanman entropy function, in which the product of the t-normed error value and the t-normed membership function value is taken as the information source value. Thus we have made use of the higher form of the Mamta-Hanman entropy function. This composite entropy function is converted into a composite fuzzy classifier. Its performance is compared with that of the Random Forest classifier (TreeBagger) and SVM. The best results are obtained with the Adaptive Renyi entropy features using the Composite Fuzzy Classifier; the results of Random Forest and SVM are slightly inferior.
We hope the new features will find applications in different domains.
Bhatia, A. and Hanmandlu, M. (2018) Keystroke Dynamics Based Authentication Using Possibilistic Renyi Entropy Features and Composite Fuzzy Classifier. Journal of Modern Physics, 9, 112-129. https://doi.org/10.4236/jmp.2018.91008