A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm

doi:10.4236/jsea.2013.68049

Paper Menu >>

Journal Menu >>

Journal of Software Engineering and Applications, 2013, 6, 396-404

http://dx.doi.org/10.4236/jsea.2013.68049 Published Online August 2013 (http://www.scirp.org/journal/jsea)

A Hybrid Web Recommendation System Based on the

Improved Association Rule Mining Algorithm

Ujwala H. Wanaskar1, Sheetal R. Vij2, Debajyoti Mukhopadhyay3

1Department of Computer Engineering, Padmabhooshan Vasantdata Patil Institute of Technology, Pune, India; 2Department of

Computer Engineering, Maharashtra Institute of Technology, Pune, India; 3Department of Information Technology, Maharashtra

Institute of Technology, Pune, India.

Email: ujwalaw.267@gmail.com, sheetal.sh@gmail.com, debajyoti.mukhopadhyay@gmail.com

Received June 18th, 2013; revised July 16th, 2013; accepted July 23rd, 2013

cense, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

As the growing interest of web recommendation systems those are applied to deliver customized data for their users, we

started working on this system. Generally the recommendation systems are divided into two major categories such as

collaborative recommendation system and content based recommendation system. In case of collaborative recommen-

dation systems, these try to seek out users who share same tastes that of given user as well as recommends the websites

according to the liking given user. Whereas the content based recommendation systems tries to recommend web sites

similar to those web sites the user has liked. In the recent research we found that the efficient technique based on asso-

ciation rule mining algorithm is proposed in order to solve the problem of web page recommendation. Major problem of

the same is that the web pages are given equal importance. Here the importance of pages changes according to the fre-

quency of visiting the web page as well as amount of time user spends on that page. Also recommendation of newly

added web pages or the pages that are not yet visited by users is not included in the recommendation set. To overcome

this problem, we have used the web usage log in the adaptive association rule based web mining where the association

rules were applied to personalization. This algorithm was purely based on the Apriori data mining algorithm in order to

generate the association rules. However this method also suffers from some unavoidable drawbacks. In this paper we

are presenting and investigating the new approach based on weighted Association Rule Mining Algorithm and text

mining. This is improved algorithm which adds semantic knowledge to the results, has more efficiency and hence gives

better quality and performances as compared to existing approaches.

Keywords: Web Recommender System; Association Rules; Web Mining; Text Mining

1. Introduction

With the web2.0 introduced, its use is growing up along

with high speed development in infrastructure and ser-

vices. Several opportunities, like sharing information and

opinion with different users, came out. This did favor the

event of social networks like Facebook. Nowadays, au-

thors will share their creations with numerous readers

round the globe.

Amateur-musicians will get renowned faster than ever

before simply with uploading their tracks. Business world

have found a lot of customers and profit within the net.

The range of on-line retailers, auctions or entozoan mar-

kets can be displayed within the net. Today, each user of

the World Wide Web can buy virtually any item being in

any country of the planet without any place-limitations.

In fact, there’s virtually endless place where we dis-

covered a new problem within the computer universe.

The number of data and things got very vast, resulting in

associate degree data overload. It became an enormous

drawback to seek out what the user is really probing for.

Search engines partly solved that problem; but personal-

ization of data was not considered.

Therefore, the system developers found the solution in

recommender systems. Recommender systems are tools

for filtering and sorting things and data. They use opin-

ions of a community of users to assist people to deter-

mine content of interest from a doubtless overwhelming

set of decisions. There’s an enormous diversity of algo-

rithms and approaches that facilitate making personalized

recommendations. Two of them became very popular:

collaborative filtering and content-based filtering. They

are used as a base of latest recommender systems.

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm 397

Appearance of mobile devices with new technologies,

like GPS and 3G standards, in the market issued new

challenges. Recommender systems got concerned in de-

veloping method of touristy, security and alternative ar-

eas. Trendy recommender systems are raising their rec-

ommendations accuracies by exploitation context-aware,

semantic and alternative approaches. Today, recommen-

dations are a lot of specific and personalized too. Issues

of combining completely different technologies and re-

commending approaches for higher results can invariably

exist and can be the area of interest of latest researchers.

Most of the website recommender systems that were

planned earlier utilized cooperative filtering. Cooperative

filtering is commonly utilized in general product recom-

mender systems, and consists of the subsequent stages.

The foremost stage in cooperative filtering is to investi-

gate users purchase histories so as to extract user teams

that have similar purchase patterns. Then suggest the pro-

ducts that are usually most popular within the user’s clus-

ter.

Basically opinions of community members used by the

Recommender Systems (RS) in order to facilitate people

to establish the knowledge possibly to be fascinating to

them or pertinent to their desires. This will be achieved

by drawing on user preferences and filtering the set of

possible choices to a lot manageable set. Each internet

Recommendation System has its own blessings and limi-

tations.

Moreover the assignment of advocated system is to

recommend things that match a user’s style, so as to as-

sist the user in selecting/purchasing things from a devas-

tating set of selections. Such systems have huge impor-

tance in applications like e-commerce, subscription pri-

marily based services, info filtering, internet services etc.

Recommendations are generated based on two ele-

mentary approaches. First are content based approaches

in which the profiles are users and things by distinguish-

ing their characteristic options, such as demographic in-

formation for user identification, and product informa-

tion/descriptions for item profiling. The profiles are util-

ized by algorithms to unite user interests and item de-

scriptions once generating recommendations (Takacs et

al.). Online page Recommendation is an energetic appli-

cation space for information filtering, internet Mining

and Machine Learning analysis. Another approach is co-

operative recommendation that tries to seek out some

users who share similar tastes with the given user and

recommends websites they prefer to that user.

In this paper we are presenting a new association rule

mining approach based on weighted association rule min-

ing. We are using the weighted association rule mining

algorithm in order to overcome the drawbacks of existing

approaches. This approach helps user to obtain the web

sites which are most relevant to them.

In the following Section 2 we will discuss the different

types of recommendation approaches along with their

advantages and disadvantages. Section 3 presents the pro-

posed approach for web page recommendation.

2. Literature Review

2.1. Traditional Recommendation Approaches

2.1.1. Content -B ased Filtering

Content-based recommender systems work with profiles

of users that are created at the beginning. A profile has

information about a user and his taste which is based on

how user rates the items. Generally, when creating a pro-

file, recommender systems make a survey, to get initial

information about a user in order to avoid the new-user

problem [1].

In the recommendation process, the engine compares

items that were already positively rated by user with

items he did not rate and looks for similarities. Those

items that are mostly similar to the positively rated ones,

will be recommended to the user. Content-based recom-

mender systems mostly use tags or keywords for efficient

and better filtering. In this case the profiles of other users

are not essential and they don’t influence the recommen-

dations of the user, as the recommendations are based on

individual information. Going in details of methods of

collaborative filtering, we can distinguish most popular

approaches: user-based, item-based and model-based ap-

proaches.

2.1.2. Collaborative Filtering

Collaborative filtering became one of the most researched

techniques of recommender systems since this approach

was mentioned and described by Paul Resnick and Hal

Varian in 1997 [2]. The idea of collaborative filtering is,

finding the users in a community that share appreciations

[3]. If two users have same or almost same rated items in

common, then they have similar tastes. Such users build

a group or a so called neighborhood. A user gets recom-

mendations to choose items that he/she has not rated be-

fore, but that were already positively rated by users in

his/her neighborhood.

Collaborative filtering is widely used in e-commerce.

Customers can rate books, songs, movies and then get

recommendations regarding those issues in future. More-

over collaborative filtering is utilized in browsing of cer-

tain documents (e.g. documents among scientific works,

articles, and magazines) [4].

Following Figures 1 and 2 shows the user-based and

item based methods respectively:

2.1.3. Hybrid Recommendation Approaches

For better results some recommender systems combine

different techniques of collaborative approaches and con-

tent based approaches. Using hybrid approaches we can

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm

398

Figure 1. User based approach.

Figure 2. Item based approach.

avoid some limitations and problems of pure recommen-

der systems, like the cold-start problem. The combination

of approaches can proceed in different ways:

1) Separate implementation of algorithms and joining

the results;

2) Utilize some rules of content-based filtering in col-

laborative approach;

3) Utilize some rules of collaborative filtering in con-

tent based approach;

4) Create a unified recommender system that brings

together both approaches.

Robin Burke worked out taxonomy of hybrid recom-

mender systems categorizing them [5].

2.2. Modern Recommendation Approaches

2.2.1. Context -Aware Approaches

Context is the information about the environment of a

user and the details of situation he/she is in. Such details

may play much more significant role in recommenda-

tions than ratings of items, as the ratings alone don’t have

detailed information about under which circumstances

they were given by users. Some recommendations are

more suitable to the user in evening and doesn’t match

his preferences in the morning at all and he/she would

like to do one thing when it’s cold and completely an-

other when it’s hot outside. The recommender systems

that pay attention and utilize such information in giving

recommendations are called context-aware recommender

systems.

One of the biggest problems of context-aware recom-

mender systems is obtaining context information. The in-

formation can be obtained explicitly by directly interact-

ing with user asking him/her to fill out a form and mak-

ing a survey. Although it is mostly desirable to obtain

context information without making the whole rating and

reviewing process complicated. Another way is gathering

information implicitly using the sources like GPS, to get

location, or a timestamp on transaction [6]. The last way

of information extraction is analyzing users and observ-

ing their behavior or using data mining techniques.

2.2.2. Semantic Based Approaches

Most of the descriptions of items, users in recommender

systems and the rest of the web are presented in the web

in a textual form. Using tags and keywords without any

semantic meanings doesn’t improve the accuracy of re-

commendations in all cases, as some keywords may be

homonyms. That is why understanding and structuring of

text is a very significant part recommendation. Tradi-

tional text mining approaches that base on lexical and

syntactical analysis show descriptions that can be under-

stood by a user but not a computer or a recommender

system. That was a reason of creating new text mining

techniques that were based on semantic analysis. Recom-

mender systems with such techniques are called semantic

based recommender systems.

The performance of semantic recommender systems

are based on knowledge base usually defined as a con-

cept diagram (like taxonomy) or ontology.

2.2.3. Cross- D om ain Based Approaches

Finding similar users and building an accurate neighbor-

hood is an important part of recommending process of

collaborative recommender systems. Similarities of two

users are discovered based on their appreciations of items.

But similar appreciations in one domain do not surely

mean that in another domain valuations are similar as

well.

2.2.4. Peer-to-Peer Approaches

The recommender systems with P2P approaches are de-

centralized. Each peer can relate itself to a group of other

peers with same interests and get recommendations from

the users of that group. Recommendations can also be

given based on the history of a peer. Decentralization of

recommender system can solve the scalability problem

[7]

2.2.5. Cross-lingual App roache s

The recommender system based on cross-lingual ap-

proach lets the users receive recommendations to the

items that have descriptions in languages they don’t

speak and understand. Yang, Chen and Wu purposed an

approach for a cross lingual news group recommendation.

The main idea is to map both text and keywords in dif-

ferent languages into a single feature space, that is to say

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm 399

a probability distribution over latent topics. From the

descriptions of items the system parses keywords than

translates them in one defined language using dictionar-

ies. After that, using collaborative or other filtering, the

system gives recommendations to users [8].

2.3. Challenges and Issues of Recommendation

Approaches

2.3.1. Cold -Start

It’s difficult to give recommendations to new users as his

profile is almost empty and he hasn’t rated any items yet

so his taste is unknown to the system. This is called the

cold start problem. In some recommender systems this

problem is solved with survey when creating a profile.

Items can also have a cold-start when they are new in the

system and haven’t been rated before. Both of these pro-

blems can be also solved with hybrid approaches.

2.3.2. Trust

The voices of people with a short history may not be that

relevant as the voices of those who have rich history in

their profiles. The issue of trust arises towards evalua-

tions of a certain customer. The problem could be solved

by distribution of priorities to the users.

2.3.3. Scalability

With the growth of numbers of users and items, the sys-

tem needs more resources for processing information and

forming recommendations. Majority of resources is con-

sumed with the purpose of determining users with similar

tastes, and goods with similar descriptions. This problem

is also solved by the combination of various types of

filters and physical improvement of systems. Parts of nu-

merous computations may also be implemented offline in

order to accelerate assurance of recommendations online.

2.3.4. Sparsity

In online shops that have a huge amount of users and

items there are almost always users that have rated just a

few items. Using collaborative and other approaches re-

commender systems generally create neighborhoods of

users using their profiles. If a user has evaluated just few

items then it’s pretty difficult to determine his taste and

he/she could be related to the wrong neighborhood. Spar-

sity is the problem of lack of information [9].

2.3.5. Pri vacy

Privacy has been the most important problem. In order to

receive the most accurate and correct recommendation,

the system must acquire the most amount of information

possible about the user, including demographic data, and

data about the location of a particular user. Naturally, the

question of reliability, security and confidentiality of the

given information arises. Many online shops offer effec-

tive protection of privacy of the users by utilizing spe-

cialized algorithms and programs.

3. Our Approach and Basics

In this paper we are trying to describe, analyze, imple-

ment and upgrade the mostly used method for web min-

ing i.e. association rule mining [10,11]. This technique

can be easily used in recommendation systems and it is

scalable [12,13]. This method gives high precision [14],

and only gives binary weight to the pages that are visited

i.e. to find whether the page is present or not. Usually if

page is present means it is considered important. It is

possible that not all the pages visited by the user are of

his interest. User may visit a page but it may not have

useful information for him. So factors like time spent by

the user and visiting frequency of the page should be

considered for the page consideration [15]. Because if

user finds it interesting then only user will spend more

time on it or user will visit that page frequently [16]. So

in association rule mining method the weight of the page

is also included. This is called weighted association rule

mining.

The Page weight is calculated by using following for-

mulae [14],



 



Total Duration

Size

Duration =Total duration

max Size

(1)

No. of Visit()1

Frequency()* In degree()

No. of visit()







(2)

 





 

2 * Freqiency* Duration

Weight Frequency Duration

ppp

 (3)

Text Mining is a process of analyzing the text. It is

useful in finding meaningful information from the given

text. This technique adds semantic knowledge to the data.

It is used in many applications like information retrieval,

language processing, Data mining etc. There are many

algorithms used for text mining. But most popular algo-

rithm is TF-IDF. TF-IDF is term frequency and inverse

document frequency. It is used to find out weighting fac-

tor i.e. how important is a word in the given document. It

is used in the search engine and also to find out relevance

of the documents by ranking them. This mining tech-

nique is used in the last step in the proposed system prior

to generating the recommendations. By using this mining

we actually filter out the pages that are highly relevant to

the search query.

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm

400

4. Proposed Algorithm candidate set;

Step 4: For the current user session give results in the

previous step to the text mining which will generate

some results. Sort the results given by step 3 and step 4

to generate final recommendation set.

The application like web personalization is nothing but

the application of machine learning techniques as well as

data mining in order to build user behavior models. This

application is basically used to identify the requirements

of user and future interactions adapting with the main

goal of increased user satisfaction.

The above steps are described as follows:

After the text edit has been completed, the paper is

ready for the template. Duplicate the template file by us-

ing the Save As command, and use the naming conven-

tion prescribed by your journal for the name of your pa-

per. In this newly created file, highlight all of the con-

tents and import your prepared text file. You are now

ready to style your paper.

The web recommendation systems represent unique as

well as important class of the personalized web applica-

tions. The focus over the user based filtering as well as

relevant information selection. Many techniques such as

clustering based approaches, content based filtering, se-

quence and association rule based etc. are used.

In this paper we are giving hybrid recommendation

approach which uses web usage mining and text mining.

We are presenting the new data mining approach which

is based on HITS and weighted association rule mining

for the efficient web recommendation system. This me-

thod is used for providing the user a personalized web

experience.

4.1. Cluster the Pages Based on Users’ Usage

Pattern

In this algorithm, the web pages are clustered not from

the content but from the pattern of their usage, assuming

that users find a page very interesting and is important

for the user so finds their actions [16]. In the next step we

use the result of this step to extend the recommendation

set. In this step we try to cluster the pages that occur to-

gether across the sessions. Page cluster group together

frequently occurring items even though they do not seem

to be similar. This step forms clusters with overlapping

interest of different types of users.

In this method, quantitative weight is assigned to each

page according to amount of time the user spends on that

page or the frequency of visiting that page [16]. Gener-

ally while recommending the data there is a problem of

rarely visited pages or newly added pages as they would

never be added to the recommendation set. So to over-

come this problem in our approach we are including

these pages in the data set by using HITS algorithm [16].

HITS algorithm is used to extend the data set as well as

to rank the pages. Below Figure 3 is showing the archi-

tecture for proposed web recommendation method.

4.2. Generating the Seed Recommendation Set

In this step, firstly weighted association rule of each

URL is found out from web log data, these rules repre-

sents users navigation on the web. Secondly, the recom-

mendation engine will search the top-n most similar

weighted rules to the active user session before generat-

ing recommendation for the user. During the second part

instead of exact match between the active user and rules,

we use a similarity measure for finding the most similar

rules.

As shown in above Figure 3, the steps in the algo-

rithms could be briefly summarized as follows:

Step 1: Cluster the pages based on users’ usage pat-

tern;

Step 2: Generate the seed recommendation set based

on Weighted Association Rules Mining;

Step 3: Extend the seed set by clusters to generate the

candidate set and apply the HITS algorithm to rank the Mining Weighted Association Rules:

Figure 3. Architecture of proposed approach.

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm 401

Each transaction consists of the set of pages, the asso-

ciation rule is of the form,

Ywhere

I,

, , Where X and Y are the two itemsets,

X is the body of the rule and Y head of the rule. In this

method we associate weight parameter with each page to

reflect the interest of user which is called as weighted

association rule mining. In this technique we just extend

the traditional Apriori [17] Algorithm by adding weight

as one of the parameter.

YIXY

Current user session is represented as vector S of sig-

nificance weight if user has accessed the page, si = 0

otherwise. After this we find out the match score be-

tween association rules that generated based on naviga-

tional pattern history and current active session. This

match score is calculated as follows [16],

  



 



Dissimilarity ,

irLi

wsi wrLi

srL wsi wrLi





















(4)

i: 0

1Dissimilarity( ,)

Match Score (,)141

rLi

srL



  (5)













Rec,Match Score,*wconfSX pSXX p

(6)

where s is current user session, rL are the rules generated

and wconf is the weighted confidence.

This recommendation system is an online component

of personalization system which determines which items

to be recommended to the user. The recommendation

score is calculated by multiplying match score and

weighted confidence. Finally top n-most similar pages

are sorted to be used in the next phase.

4.3. Extending the Seed Set and Apply Hits

Generally problem with the recommendation system is

that its recommendation accuracy decreases as dataset

increases also rarely visited pages or newly added pages

are not included in the recommendation set. These pages

should be included in the recommendation set otherwise

they would never get recommended.

To overcome this problem we have used seed recom-

mendation set generated in the previous step as input to

this step. We extend this set to generate candidate set.

For each page in the seed set candidate is supplemented

with the pages that are in the same cluster. A graph is

generated from the pages included in the candidate set by

connecting them with the link that exists, results in the

connectivity graph which represents improved naviga-

tional pattern. This process of obtaining the connectivity

graph is same as that used by HITS algorithm [18] to find

Authority and Hubs. So we take the advantage of HITS

algorithm to identify hubs and authority pages within that

clusters which allows us to rank pages within the clusters.

Here only Hub measures are considered as it may link to

many authority pages [19]. Using this hub value we will

rank the candidate set pages in online module to form

Match Score I.

4.4. Apply Text Mining on Results and Generate

Final Recommendation

In this step, the text mining is done for more approxi-

mated results. The TF-IDF algorithm is used for this. The

results of the above step as well as the page results for

the current user session are given as the input to this

stage. Now the results given by this stage and previous

step are sorted to generate final recommendation set.

4.5. HITS Mathematical Model

According to Jon Kleinberg’s a higher hi number as be-

ing better hubs. Given the weights {ai} and {hi} of all

the nodes in SQ, we dynamically update the weights as

shown in the following Figure 4.

A good hub increases authority weight of pages it

points. A good authority increases the hub weight of the

pages that point to it. The idea is then to apply the two

operations above alternatively until equilibrium values

for the hub and authority weights are reached. Let A be

the adjacency matrix of graph SQ and denote the author-

ity weight vector by v and the hub weight vector by u,

where

(a)

(b)

Figure 4. Dynamic weight updation.

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm

402

 

 



 

 

(7)

The two update operations described as

vAu

uAv



 (8)

If we consider that initial weights of the nodes are

uvA

 

 



 

 

(9)

Then, after k steps we get:



vAAv

uAAu





  (10)

Algorithm 1: Under the assumptions that AAt and AtA

are primitive matrices, following statements hold:

1) If v1, ... , vk is the sequence of authority weights we

have computed, then v1, ..., vk converges to the unique

probabilistic vector corresponding to the dominant eigen

value of the matrix AtA. With a slight abuse of notation,

we denoted in here by vk the vector vk normalized so that

the sum of its entries is 1.

2) Likewise, if u1, ..., uk are the hub weights that we

have iteratively computed, then u1, ..., u converges to the

unique probabilistic vector corresponding to the domi-

nant eigen value of the matrix AAt. We use the same no-

tation, that k

, where c is the scalar equal to

the sum of the entries of the vector uk.

(1 /)

Uc

So authority weight vector is the probabilistic eigen-

vector corresponding to the largest eigenvalue of AtA,

while hub weights of the nodes are given by the prob-

abilistic eigenvector of the largest eigenvalue of AAt:

Algorithm:

1) The matrices AAt and AtA are real and symmetric, so

they have only real eigenvalues.

2) Perron Frobenius theorem. If M is a primitive ma-

trix, then:

 The largest eigen value λ of M is positive and of mul-

tiplicity 1;

 Every other eigen value of M is in modulus strictly

less than λ;

 The largest eigen value λ has a corresponding eigen

vector with all entries positive.

3) Let M be a non-negative symmetric and primitive

matrix and v be the largest eigenvector of M, with sum of

its entries equal to 1. Let z be the column vector with all

entries non-negative, then, if we normalize the vectors z,

Mz, ..., Mkz, and then the sequence converges to v.

HITS algorithm is in the same spirit as PageRank.

They both make use of the link structure of the Web

graph in order to decide the relevance of the pages. The

difference is that unlike the PageRank algorithm, HITS

only operates on a small subgraph (the seed SQ) from the

web graph. This subgraph is query dependent; whenever

we search with a different query phrase, the seed changes

as well. HITS rank the seed nodes according to their au-

thority and hub weights. The highest ranking pages are

displayed to the user by the query engine.

5. Experimental Analysis

To evaluate the effectiveness of the method, performance

is measured using two factors like precision and cover-

age [18]. Recommendation precision means number of

correct recommendations i.e. proportion of relevant re-

commendations to the total number of recommendations.

Precision is given by the formula,

(() ())

precision ()

Tp Rp

 (11)

Coverage of the system is the proportion of relevant

recommendations to the all pages that should be recom-

mended. Where R(p) is recommendation set and T(p) is

session. Precision of the recommendations are measured

for varying number of recommended pages. So based on

above proposed system we have worked on practical

evaluation using the JAVA, J2EE. We have done imple-

mentation through the web application as shown in the

following Figure 5.

Following graphs are showing the performance eva-

luation of the existing algorithm i.e. based on only

weighted association rule mining and our proposed ap-

proach based on the weighted association rule mining

Figure 5. Search results of web page recommendation sys-

tem.

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm 403

and text mining. The following table readings and graphs

show the improved performance of as compared to ex-

isting cases.

Comparative study of precision rate between proposed

and existing method based on number of pages ranked.

Tables 1 and 2 shows the readings we got during our

practical analysis and Figures 6 and 7 shows the graph

Figure 6. Precision comparative analysis.

Figure 7. Coverage comparative analysis.

Table 1. Precision comparison readings.

Precision

No of Recommended

Pages Proposed Algorithm Previous Algorithm

1 100 100

2 50 33.33

3 100 100

4 100 87.5

5 80 60

6 58.33 50

7 57.14 50

8 85.71 62.5

9 33.33 27.77

Table 2. Coverage comparison readings.

Coverage

No of Recommended

Pages Proposed Algorithm Previous Algorithm

1 50 50

2 66.66 66.66

3 65.5 16.66

4 100 50

5 100 83.33

6 57.14 50

7 100 83.33

8 62.5 16.66

9 33.33 20

13 62.5 38.46

for those readings.

The table and the following graph shows that the pro-

posed approach shows improved results as compared to

the previous approach. So approach is efficient as com-

pared to the previous existing approaches.

6. Conclusion and Future Scope

This paper proposes as new web recommendation system

based on weighted association rule mining and text min-

ing. In this approach, weight is assigned to each page to

show its importance depending on the time spent by each

user on a particular page or visiting frequency of each

page. To add semantic knowledge to the data to be rec-

ommended, text mining is used.

One of the challenging problem with recommendation

system is that pages that are newly added or rarely vis-

ited. They are generally not included in the recommenda-

tion set, so in this approach we have added these pages to

the recommendation page set. The performance of the

system is evaluated under different settings and in com-

parison with the previous method which is based only on

the weighted association rule mining

Web recommendation system is used to recommend

pages to users. The application can be used for personal-

ized recommendation to give personalized recommenda-

tion based on users browsing history. This system can be

used in search engines to give recommendation to users

based on the users search keyword so that system will

make proper recommendation, filtering unrelated infor-

mation. Recommendations can be given for individual

sites/ports or generalized sites.

Throughout this paper we have discussed many as-

pects of research for web recommendation systems. We

have presented related work, problems associated with

A Hybrid Web Recommendation System Based on the Improved Association Rule Mining Algorithm

404

existing methods as well as literature study over various

research methods in the same domain. Based on existing

limitations, in this paper new mining approach based on

combination of weighted association rule mining and text

mining is presented which is showing the better perfor-

mance improvement as compared to the existing methods.

For the work we suggest to apply this method under

cloud computing environment.

REFERENCES

[1] D. Dubois, E. Hüllermeier and H. Prade, “A Systematic

Approach to the Assessment of Fuzzy Association

Rules,” Data Mining and Knowledge Discovery Journal,

Vol. 13, No. 2, 2006, pp. 167-192.

doi:10.1007/s10618-005-0032-4

[2] P. Resnik and H. Varian, “Recommender System,” Com-

munication of the ACM, Vol. 40, No. 3, 1997.

[3] G. Takacs, I. Pilaszy, B. Nemeth and D. Tikk, “Scalable

Collaborative Filtering Approaches for Large Recom-

mender Systems,” The Journal of Machine Learning Re-

search, Vol. 10, 2009, pp. 623-656.

[4] S. Orlando, R. Perego and C. Silvestri, “A New Algo-

rithm for Gap Constrained Sequence Mining,” Proceed-

ings of the ACM Symposium on Applied Computing,

Nicosia, 14-17 March 2004, pp. 540-547,

[5] R. Burke, “Hybrid Recommender System: Survey and

Experiments,” User Modeling and User-Adapted Interac-

tion, Vol. 12, No. 4, 2002, pp. 331-370

[6] Baltrunas and X. Amatriain, “Towards Time-Dependant

Recommendation Based on Implicit Feedback,” Work-

shop on Context-Aware Recommender Systems, New

York, October 2009.

[7] Y. Shavitt, W. Ela and W. Udi, “Building Recommender

Syterm Using Peer to Peer Shared Content,” CIKM’10,

Toronto, 25-29 October 2010.

[8] C.-Z. Yang, I.-X. Chen and P.-J. Yu, “Cross Lingual

News Group Recommendation Using Cluster Based cross

Training,” Computational Linguistic and Chinese Lan-

guage Processing, Vol. 13, No. 1, 2008, pp. 41-60.

[9] B. Sarwar, G. Karypis, J. Konstan and J. Reidl, “Item-

Based Collaborative Filtering Recommendation Algo-

rithms,” In: Proceedings of the 10th International Con-

ference on World Wide Web, ACM, New York, 2001, pp.

285-295.

[10] H. Wang and S. M. Thao, “A Study on Personalized Web

Browsing Recommendation Based on Data Mining and

Collaborative Filtering Technology,” Proceedings of Na-

tional Computer Symposium, Taiwan, 2003, pp. 18-25.

[11] Gery and H. Haddad, “Evaluation of Web Usage Mining

Approaches for User’s Next Request Prediction,” Pro-

ceedings of the Fifth ACM International Workshop on

Web Information and Data Management, New Orleans,

7-8 November 2003, pp. 74-81.

[12] B. Mobasher, R. Cooley and J. Srivastava, “Automatic

Personalization Based on Web Usage Mining,” Commu-

nications of the ACM, Vol. 43, No. 8, 2000, pp. 142-151.

doi:10.1145/345124.345169

[13] B. Mobasher, H. Dai, T. Luo and M. Nakagawa, “Effec-

tive Personalization Based on Association Rule Discov-

ery from Web Usage Data,” Proceedings of the 3rd ACM

Workshop on Web Information and Data Management

(WIDM01), Atlanta, 5-10 November 2001.

[14] B. Mobasher, “Web Usage Mining and Personalization,”

In: Practical Handbook of Internet.

[15] C. Shahabi, A. Zarkesh, J. Abidi and V. Shah, “Knowl-

edge Discovery from User’s Web-Page Navigation,”

Proceedings of the 7th IEEE International Workshop on

Research Issues in Data Engineering, Birmingham, 7-8

April 1997.

[16] R. Forsati, M. Meybodi and A. Rahbar, “An Efficient

Algorithm for Web Recommendation Systems,” AICCSA

2009 IEEE/ACS International Conference on Computer

Systems and Applications, Rabat, 10-13 May 2009, pp.

579-586. doi:10.1109/AICCSA.2009.5069385

[17] R. Agrawal and R. Srikant, “Fast Algorithms for Mining

Association Rules in Large Databases,” Proceedings of

the 20th International Conference on Very Large Data

Bases VLDB’94, Santiago, 12-15 September 1994, pp.

487-499.

[18] J. M. Kleinberg, “Authoritative Sources in a Hyperlinked

Environment,” Journal of the ACM, Vol. 46, No. 5, 1999,

pp. 604-632. doi:10.1145/324133.324140

[19] O. Zaϊane, J. Li and R. Hayward, “Mission-Based Navi-

gational Behavior Modeling for Web Recommender Sys-

tem,” Springer-Verlag, Berlin, Heidelberg, 2007.