Advancement in Information Foraging Theory


Intelligent Information Management
Vol.4 No.6(2012), Article ID:25189,7 pages DOI:10.4236/iim.2012.46042

Shailesh Khapre, M. S. Saleem Basha


Department of Computer Science, Pondicherry University, Pondicherry, India

Email: shaileshkhaprerkl@gmail.com, m.s.saleembasha@gmail.com

Received August 13, 2012; revised August 22, 2012; accepted October 16, 2012

Keywords: Information Foraging; Information Scent; Patch-Models; Diet-Models; Marginal-Value; Foraging Theory

ABSTRACT

This paper presents the advantages of information foraging theory compared with traditional information retrieval theory and user behavior analysis theory, and describes a research framework for information foraging theory. Based on a thorough review of two research branches, namely the basic concepts of information foraging theory and its elementary models, an extended framework is proposed. Several problems for future research are also identified.

1. Introduction

Foraging theory was first used by ecologists and anthropologists to model and explain the behavior of animals while feeding. Animal feeding displays many interesting phenomena: animals choose different habitats at different times, select different foods in different environments, and interact with other animals, so foraging strategy has a great impact on animal populations. Darwinism holds that over the long process of evolution organisms become increasingly well adapted to their environment, which is largely reflected in their ability to select the optimal foraging strategy for their own characteristics and for the environment in which they live.

With the rapid development of the Internet, and especially since the emergence of search engines, Internet users actively search for information in a way that closely resembles the foraging behavior of animals. Users must spend time, money and effort to obtain the information they need, and they try to strike an optimal balance between these costs and the value obtained; this behavior can be called information foraging. In the process of finding and digesting information, people adjust their foraging strategy to the information environment in which it is applied so as to maximize their information gains.

The importance of information foraging has long been a matter of concern. At the 1997 Global Knowledge Conference, Kofi Annan observed that the information gap has become a new dividing line between those who are moving onto the path of sustainable development and those who continue to fall behind [1].

Information foraging theory was first proposed by Pirolli [2], and PARC (Palo Alto Research Center) researchers have conducted more detailed and systematic studies of it [3-6]. Summarizing this work, information foraging theory is used to explain and simulate how people search for information in a network environment: information search behavior models simulate the user’s search and access process and calculate its efficiency, and the calculated efficiency is compared with user expectations in order to simulate the user’s information foraging behavior in a particular environment.

2. Information Foraging Theory Background

Information foraging theory provides a number of methods and ideas that can effectively compensate for deficiencies in traditional theories of information retrieval and information access. From a systematic analysis of the relevant research papers, we believe that the advantages of information foraging theory over traditional information retrieval theory and user behavior analysis theory are mainly reflected in the following two aspects:

1) In the traditional information retrieval model, the research task and the task environment are precisely defined: the potential goals and potential actions are fixed in advance. In reality, however, users may change their goal or change the information environment during the retrieval process. Information foraging theory holds that information on the network is distributed in patches (also called plaques). Patches form a hierarchy: small web pages are patches at the lower level, and a major portal website is a higher-level structure. A user’s information retrieval behavior then becomes foraging within a patch, where he or she must decide whether to continue foraging in the current patch or to look for the next information patch. By analyzing user-specific tasks and the information environment, and by establishing models of the different patches in the network, information foraging theory can to a certain extent explain and simulate the process of moving between information patches [7].

2) Traditional information retrieval theory divides retrieved results into relevant and non-relevant and evaluates a search by recall and precision [8]. In reality, however, a user faced with a large number of results often reads only part of them. This is not because the remaining results are irrelevant, but because their content repeats what the user has already read, so later results have lower value to the user. Moreover, users often do not know when to stop foraging in a particular environment, because it is difficult to judge whether more valuable information can still be found there. When users find it difficult to keep collecting valuable information, that is, when they are no longer satisfied with the information in the environment or must pay a high price to obtain it, they stop foraging or move to another environment to continue the search. Information foraging theory assumes that the user’s benefit can be quantified; it builds an information gain curve and uses the idea of marginal gain to explain the average efficiency of information access [9].

Although continuing advances in cognitive science offer a better understanding of the problems network users face and of how they solve them, and so support the analysis of specific user behavior, the costs and benefits of search have not been the focus of traditional cognitive models of information search behavior. Research has been biased toward qualitative work: it concentrates on process models and concepts of users’ information search behavior and lacks deeper study from a quantitative point of view through mathematical modeling. Papers that analyze users’ information search behavior through surveys (questionnaires, log analysis, etc.) do contain some quantitative analysis, but they offer no abstract, unified model; they are directed at users in a specific field, and so they cannot predict future behavior or describe and explain a wider range of user behavior. These theories are therefore not fully applicable to information foraging behavior in a real network environment.

Information foraging theory views the user’s information foraging process from a cognitive point of view, combines it with theories of information retrieval and information access, and, through the design of specific goals and tasks, explains user behavior with good results [10,11].

3. The Basic Concepts of Information Foraging Theory

3.1. The Concept of Information Scent

In information foraging theory, information scent [12] (also translated as “information clues”) is a very interesting concept. Information scent refers to the detection and use of suggestive cues; links on the World Wide Web are among the most typical information cues. A link in a web page is often accompanied by explanatory text and pictures, and these can be seen as the cues that guide information foraging on the network. The relationship between an information cue and the information resource it links to is uncertain. In most cases the information the user wants is not directly visible. Information scent therefore plays a very important role in directing the user’s search: as cues accumulate during foraging, the user forms a holistic understanding of the target, which is then used to evaluate the content found. The main purpose of information foraging theory is to analyze foraging behavior on the network under the guidance of information scent.

The traditional literature on animal foraging usually assumes that predators make their foraging decisions based on their body shape, their habitat and the types of food available. Information foraging closely resembles a predator’s pursuit of prey: the forager needs to predict which environments hold more abundant resources, and the difficulty of capturing the prey [13] should also be considered in decision-making. Like other creatures, humans classify what they need, which here is information rather than food. Information foraging theory likewise assumes that information foragers rely on the categories already in their minds and on their judgment of the available information, combined with the specific task, to develop an appropriate foraging plan for each network environment.

3.2. The Basic Theory of Information Foraging

The analysis of information scent in information foraging theory is based on the following four theories:

1) Egon Brunswik’s lens model [14]. The model holds that humans evaluate or judge events through certain cues. By studying how people use these cues and how heavily each cue is weighted, we can understand their strategy of assessment and judgment.

2) Anderson’s adaptationist theory of categorization [15]. The theory describes how organisms use the properties they have observed to predict properties they have not observed.

3) Anderson’s adaptationist theory of memory [16]. The theory describes how people retrieve the information they need from the information already stored in memory.

4) McFadden’s random utility model [17], a classic theory of choice.

The main purpose of studying user behavior is to show the extent to which different systems improve the efficiency with which users choose among information cues. There is a large literature on this question [18,19]; summarizing it, we find some common conclusions:

• The superiority of one information cue over another is relative and depends on the user’s information goal;

• For navigation tasks in a complex network information environment, inaccurate information cues lead to rising search costs.

Woodruff examined the problem of selecting the appropriate link from a large number of links during information foraging. If the retrieval cost is measured by the number of pages the user visits before reaching the target page, and the cost function is plotted against N for probabilities f of following a wrong link (f = 0.015, 0.030, ∙∙∙, 0.150), the retrieval cost at first grows linearly and, once a critical point is reached, grows exponentially [18]. Hogg and Huberman show a similar search cost curve, with a critical point at which growth changes from a linear to an exponential trend [20].

In information foraging there are many similar information cues, and the user must decide which link has the strongest scent and continue along it; because links are usually organized as a tree, the user can also predict the potential value of what lies further along. By extending Brunswik’s lens model, information foraging theory can simulate how information foragers use information cues to develop and evaluate foraging strategies. When the information need is complex, it is difficult to articulate all of its requirements in a search query, and a search engine or other search tool usually returns a large number of similar links. The content of the website behind a link is what users really need, so a forager facing a large number of links must predict the content behind links that he or she has not yet followed. This prediction is usually based on the cues the links provide (e.g., link title, pictures), which often look very similar, together with the target information in the forager’s mind and the experience he or she has acquired.

Summarizing the four theories, this process can be simulated as follows. To make a decision, the forager evaluates each candidate with a preference function, and a selection mechanism chooses among the candidates on the basis of that evaluation. The selection mechanism is discussed at the cognitive level in order to simulate how users draw on past experience to evaluate the unseen content behind similar-looking cues and pick out the most relevant link. The basic idea of the extended Brunswik lens model, shown in Figure 1, is how the user processes information cues to decide on an information foraging strategy.
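The selection mechanism described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each link has a numeric scent score and that the user’s choice follows a multinomial logit rule, which is one common instance of a random utility model. The function name `link_choice_probabilities`, the `noise_scale` parameter and the example scores are hypothetical and are not taken from the cited studies.

```python
import math


def link_choice_probabilities(scent_scores, noise_scale=1.0):
    """Multinomial-logit choice probabilities over candidate links.

    scent_scores: one hypothetical scent score per link (higher = stronger scent).
    noise_scale:  assumed spread of the random utility term; larger values
                  make the choice less sensitive to differences in scent.
    """
    if noise_scale <= 0:
        raise ValueError("noise_scale must be positive")
    scaled = [score / noise_scale for score in scent_scores]
    largest = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(value - largest) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]


# Three links with made-up scent scores; the strongest scent is most likely
# to be followed, but weaker links keep a non-zero probability.
scores = [2.0, 1.0, 0.5]
probs = link_choice_probabilities(scores)
print(probs)
```

Under this assumption the link with the strongest scent is chosen most often but not always, which matches the idea that the relationship between a cue and the content behind it is uncertain.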

4. Basic Model of Information Foraging Theory

4.1. Optimal Foraging Theory Model

Present studies of optimal foraging theory are almost all based on the optimization models of Stephens and Krebs [21], who give two traditional models:

4.1.1. Patch Models

In this model, food resources are distributed through the environment in patches. Foraging animals face resources that are unevenly distributed and must decide how to exploit them: they need to choose how long to forage in a patch and when to stop feeding there and move on to find a new feeding site.

4.1.2. Diet Models (Menu Model)

This model explains which resources different animals should feed on in different environments.

In fact, information resources on the network also follow certain distribution laws, and information foragers also face the problem of allocating scarce resources such as time and money, as well as the question of which information resources to consume, so both the patch model and the diet (menu) model apply to the network information environment. Information resources are not the same as food resources, however; information obeys laws of its own, for example with respect to conservation [22]. Information foraging theory therefore adapts the conventional optimization models. Although it rests on some idealized assumptions, it offers a new perspective from which to analyze and understand the problem.

Figure 1. Brunswik’s modified lens model.

Pirolli made a detailed study of the patch model and the diet model of optimal foraging theory, and the account below follows the results of his study [23].

4.1.2.1. Patch Model

Assume that information foraging in the network information environment is a search for information patches, such as a website, an essay or a book. To obtain more information, a forager who finishes foraging in one information patch must also spend some time looking for the next one. While foraging in a patch, the forager therefore faces a choice: continue foraging in the current patch, or look for the next information patch. There is consequently a turning point: when the yield (efficiency) of information access in the patch falls to a certain critical point (the level the user expects), the user leaves to find a new information patch. Information foraging theory assumes that the total gain $G$ can be measured and that foraging is carried out with the sole aim of meeting the information goal.

Assume that the time consumed in information foraging can be divided into two parts:

• The total time spent between information patches, $T_B$;

• The total time spent foraging within information patches and digesting the information, $T_W$.

These two variables are defined so that the average efficiency of information access, $R$, can be expressed by drawing on Holling’s disc equation [24].

The average efficiency of information access $R$ is expressed as:

$$R = \frac{G}{T_B + T_W} \qquad (1)$$

Following Holling’s disc equation, some assumptions are made:

• The number of information patches found is linear in the search time;

• The average time taken to find a new information patch is $t_B$;

• The average gain from each information patch is $g$;

• The average time spent foraging within an information patch is $t_W$.

Taking averages and using the Stephens and Charnov model for patches [25], the rate at which new information patches are encountered can be defined as:

$$\lambda = \frac{1}{t_B}$$

so the expected total gain $G$ can be defined as:

$$G = \lambda T_B \, g \qquad (2)$$

In Equation (2), $\lambda T_B$ is the number of patches found in the search time $T_B$; multiplied by the average gain $g$ of each patch, it gives the total gain.

Likewise, the total foraging time spent within information patches can be expressed as:

$$T_W = \lambda T_B \, t_W$$

Substituting these expressions into Equation (1) gives:

$$R = \frac{\lambda T_B \, g}{T_B + \lambda T_B \, t_W} = \frac{\lambda g}{1 + \lambda t_W} \qquad (3)$$

Comparing Equations (1) and (3), Equation (1) requires the total time and total gain to be known, whereas Equation (3) uses averages, which can be obtained from sample experiments in a given environment; Equation (3) therefore has good predictive power.

From Equation (3), two very important characteristics of the information foraging environment can be derived:

• If the information environment contains more information patches, the average time spent between patches, $t_B$, is smaller;

• If the information patches in an environment contain more valuable information, the average gain per patch, $g$, rises.

In Equation (3), if the environment contains more information patches, the encounter rate $\lambda$ increases, which means that the average time $t_B = 1/\lambda$ needed to encounter a patch is lower. Let $\pi = g/t_W$ denote the profitability of a patch. If the information patches are more profitable and other conditions are unchanged, the increase in $\pi$ raises the average efficiency of information access.
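As a quick numerical check of Equations (1) and (3), the sketch below computes the average rate of return both from totals and from averages. The numbers used (a between-patch time of 20 seconds, a gain of 8 units per patch, 30 seconds spent in each patch) are invented purely for illustration, and the function names are hypothetical.

```python
def rate_from_totals(total_gain, total_between, total_within):
    """Equation (1): R = G / (T_B + T_W)."""
    return total_gain / (total_between + total_within)


def rate_from_averages(encounter_rate, gain_per_patch, time_per_patch):
    """Equation (3): R = lambda * g / (1 + lambda * t_W)."""
    return encounter_rate * gain_per_patch / (1.0 + encounter_rate * time_per_patch)


# Illustrative (made-up) averages: t_B = 20 s, g = 8 units, t_W = 30 s.
t_B, g, t_W = 20.0, 8.0, 30.0
lam = 1.0 / t_B                      # encounter rate, lambda = 1 / t_B

# Totals implied by those averages over T_B = 1000 s of searching.
T_B = 1000.0
patches_found = lam * T_B            # number of patches encountered
G = patches_found * g                # Equation (2)
T_W = patches_found * t_W            # total within-patch time

baseline = rate_from_averages(lam, g, t_W)
same_from_totals = rate_from_totals(G, T_B, T_W)

# A denser environment (larger lambda) and a richer patch (larger g, same t_W).
denser = rate_from_averages(2 * lam, g, t_W)
richer = rate_from_averages(lam, 1.5 * g, t_W)

print(baseline, same_from_totals, denser, richer)
```

Both routes give the same rate for the baseline environment, and either a higher encounter rate or a higher gain per patch raises it, in line with the two characteristics listed above.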

4.1.2.2. Diet Model (Menu Model)

In the network environment, the forager faces the problem of which information to select. If the range of choice is too narrow, the forager may have to spend a lot of time constructing search queries, and the results will be relatively one-sided. If the range of choice is too wide, the forager may get lost in vast amounts of information. Selecting an appropriate diet (menu) is therefore very important for information foraging.

The diet model assumes that information items can be divided into types, indexed by $i = 1, 2, \ldots, n$, and that the forager knows the density of each type of information in the environment and its benefit. Let $\lambda_i$ represent the rate at which information of type $i$ is encountered. For each type of information, $g_i$ represents the gain it yields and $t_{Wi}$ the average time spent processing an item of type $i$. The profitability $\pi_i$ of each type of information is then:

$$\pi_i = \frac{g_i}{t_{Wi}}$$

The forager’s diet can be expressed as follows [26]: the user selects a set of information types, and whenever information of one of those types is encountered, the user consumes it. Let $D$ stand for the diet; for example, $D = \{1, 2, 3\}$ is the diet containing information types 1, 2 and 3. The average rate of return $R$ can then be expressed as:

$$R = \frac{\sum_{i \in D} \lambda_i g_i}{1 + \sum_{i \in D} \lambda_i t_{Wi}}$$

Pirolli gives an optimal selection algorithm [27]. If the time needed to recognize the benefit of an item is assumed to be 0, then profitability alone determines whether a type of information should be included in the diet. Adapting the algorithm of the traditional foraging model gives an optimization algorithm for selecting the information diet, which determines the $k$ types of information that yield the highest rate of return.

1) Sort the types of information by their profitability $\pi_i = g_i / t_{Wi}$.

For simplicity, assume that the sort order is $\pi_1 > \pi_2 > \cdots > \pi_n$.

2) Add types to the diet in this order until, for the first $k$ types, the average rate of return $R(k)$ is greater than the profitability of type $k + 1$, namely:

$$R(k) = \frac{\sum_{i=1}^{k} \lambda_i g_i}{1 + \sum_{i=1}^{k} \lambda_i t_{Wi}} > \pi_{k+1} = \frac{g_{k+1}}{t_{W(k+1)}}$$

Initially, the diet $D$ includes only the most profitable type, that is, $D = \{1\}$; next, $D$ contains the two most profitable types, i.e., $D = \{1, 2\}$, and so on. When type $k + 1$ no longer meets the requirement, $D$ contains $k$ types, $D = \{1, 2, \ldots, k\}$, and this diet is the optimal diet; adding any further type would reduce the average rate of return $R$.

The diet selection process is shown in Figure 2.
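The two steps of the selection procedure can be sketched in a few lines of Python. This is a minimal illustration under the zero-recognition-time assumption stated above, not Pirolli’s own implementation; the function names, the type names and all numeric values are hypothetical.

```python
def rate_of_return(diet):
    """Average rate of return R for a diet given as (lam, gain, time) triples."""
    gain_rate = sum(lam * gain for lam, gain, _ in diet)
    time_rate = sum(lam * time for lam, _, time in diet)
    return gain_rate / (1.0 + time_rate)


def optimal_diet(types):
    """Greedy diet selection.

    `types` maps a type name to (lam, gain, time): encounter rate,
    gain per item and handling time per item.
    Returns (names in the diet, rate of return of that diet).
    """
    # Step 1: rank types by profitability pi = gain / time, highest first.
    ranked = sorted(types.items(), key=lambda kv: kv[1][1] / kv[1][2], reverse=True)

    chosen_names, chosen = [], []
    for name, (lam, gain, time) in ranked:
        profitability = gain / time
        # Step 2: stop once the current diet already returns at least as
        # much per unit time as the next type would while being handled.
        if chosen and rate_of_return(chosen) >= profitability:
            break
        chosen_names.append(name)
        chosen.append((lam, gain, time))
    return chosen_names, rate_of_return(chosen)


# Made-up example: three kinds of documents with invented parameters.
types = {
    "abstracts": (0.1, 10.0, 2.0),    # profitability 5.0
    "reviews": (0.2, 6.0, 3.0),       # profitability 2.0
    "forum posts": (0.5, 1.0, 4.0),   # profitability 0.25
}
diet, rate = optimal_diet(types)
print(diet, rate)
```

In this example the third type is left out even though it is encountered most often, because handling it would lower the overall rate of return.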

4.2. The Marginal Value of the Yield Curve

The patch model of traditional optimal foraging theory, epitomized by Equation (3), emphasizes the optimal allocation of time between patches and within patches. Applying such models requires a number of assumptions. Information foraging theory assumes that [28]: there are different types of information patches; and the expected gain from an information patch depends on the time spent in the patch, which is controlled by the information forager.

Figure 2. Menu selection process.

The optimization problem is this: how long should the forager stay in a patch before leaving to search for another?

In the traditional model, patches can be divided into different types, indexed by $i = 1, 2, \ldots$

Moving from one information patch to another takes a certain amount of time, so while foraging in an information patch the user faces a choice: continue searching for information in the current patch, or leave to look for a new one.

To describe the characteristics of each type of patch, define the following properties:

• $\lambda_i$: the rate at which patches of type $i$ are encountered in the information environment;

• $t_{Wi}$: the time the information forager stays in a patch of type $i$;

• $g_i(t_{Wi})$: the gain function for information patches of type $i$, as a function of the time $t_{Wi}$ spent in a patch of that type.

The total average efficiency of information access can be expressed by Equation (4):

$$R = \frac{\sum_{i} \lambda_i \, g_i(t_{Wi})}{1 + \sum_{i} \lambda_i \, t_{Wi}} \qquad (4)$$

The gain function curve for information foraging is shown in Figure 3. It assumes that the search environment contains only one type of patch and that gain increases linearly with the time spent in the patch until, after a certain time, it no longer increases.

If the average time spent between patches, $t_B$, is 10, the forager’s search within an information patch can follow one of three strategies: $t_1$, $t_2$ or $t^*$.

As shown in Figure 3, the slopes of the three dotted lines give the average efficiencies of information foraging $R_1$, $R_2$ and $R^*$. $R^*$ corresponds to the maximum slope, which means that strategy $t^*$ is optimal.
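The comparison of strategies can be reproduced numerically. The sketch below keeps $t_B = 10$ from the text but otherwise rests on assumptions: the gain is taken to grow by one unit per unit of time and to saturate after 20 time units, and the optimum is found by a simple grid search. The function names and these two parameter values are hypothetical.

```python
def gain(t_within, slope=1.0, t_saturation=20.0):
    """Assumed gain curve: linear growth that stops at t_saturation."""
    return slope * min(t_within, t_saturation)


def average_rate(t_within, t_between=10.0):
    """Average rate of return for a single patch type: g(t_W) / (t_B + t_W)."""
    return gain(t_within) / (t_between + t_within)


# Candidate within-patch times from 0.5 to 60 in steps of 0.5.
candidates = [0.5 * step for step in range(1, 121)]
t_star = max(candidates, key=average_rate)
r_star = average_rate(t_star)

# Leaving too early or staying too long both give a lower average rate.
r_early = average_rate(5.0)
r_late = average_rate(40.0)
print(t_star, r_star, r_early, r_late)
```

With this assumed gain curve the best strategy is to leave exactly when the gain stops growing: leaving earlier wastes travel time, and staying longer adds time without adding gain.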

In practice, gain and time are not related in such a simple linear way, so the traditional model needs to be improved in order to evaluate the gain function more accurately. As mentioned earlier, information in the network environment is distributed with certain regularities, so it is possible to model the characteristics of the network as a whole.

According to Bhavnani’s experimental results [29], the documents found by a search engine are arranged as a ranked list, and the number of concepts contained in each document is taken as the gain. Assume that the forager spends 60 seconds constructing the search query and running the search, and 10 seconds browsing and processing each document. For this experiment Pirolli expresses the gain as a function of time in seconds. The idea of marginal gain can then be used to find the value of $t_W$ at which the average efficiency of information access is highest. With a single type of patch and $t_B = 60$ seconds, the average efficiency of information access is:

$$R(t_W) = \frac{g(t_W)}{60 + t_W}$$

Solving for the maximum gives $t = 97.50$ seconds, so $t_W = t - 60 = 37.50$ seconds, which gives the maximum average efficiency of information access [30]. The marginal gain curve for information foraging is shown in Figure 4, where the highest point of the average efficiency of information access can be seen directly.
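The optimum condition behind this kind of calculation can be written out explicitly. The following derivation is a sketch for a single type of patch with a differentiable gain function $g(t_W)$ and between-patch time $t_B$; it is the standard marginal value argument and does not reproduce the specific gain function used in [30].

```latex
R(t_W) = \frac{g(t_W)}{t_B + t_W}

\frac{dR}{dt_W}
  = \frac{g'(t_W)\,(t_B + t_W) - g(t_W)}{(t_B + t_W)^2} = 0

\Longrightarrow\quad
g'(t_W^*) = \frac{g(t_W^*)}{t_B + t_W^*} = R(t_W^*)
```

In words, the forager should leave the patch when the marginal rate of gain within the patch has fallen to the average rate of return for the environment as a whole. Graphically, this is the point at which a straight line drawn from $(-t_B, 0)$ is tangent to the gain curve.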

Figure 3. Gain function curve for information foraging.

Figure 4. Marginal gain curve for information foraging.

5. Conclusions

Information foraging theory analyzes the user’s information foraging behavior from the perspective of the relationship between expected cost and expected return, and reveals the impact of the information environment on that behavior. This paper has examined the information environment and user behavior in detail, with the aim of establishing appropriate optimization models to model and predict the user’s foraging strategy and thereby help optimize the user’s information foraging.

Research in this area is still deepening, and many issues remain for further study, such as cooperative foraging among the members of a group and the problem of digestion after foraging (foraging theory assumes that everything consumed can be completely translated into gain, which is obviously not realistic). Future research needs deeper analysis of information foraging models in order to simulate the information environment and the user’s foraging behavior more faithfully and so achieve better prediction.

By analyzing the efficiency of the user’s information foraging, information foraging theory makes it possible to simulate the user’s information search behavior in a specific network environment and, through the average efficiency function, to model the most effective information access behavior. It provides a new line of research and new analytical methods for information search behavior in the network environment; it has both theoretical significance and practical value and is worth further study.

REFERENCES

[1] K. Annan, Secretary General of the UN Global Knowledge Conference, Canada, 22 June 1997.
[2] P. Pirolli and S. K. Card, “Information Foraging in Information Access Environments,” Proceedings of the CHI’95, ACM Conference on Human Factors in Software, ACM Press, New York, 1995, pp. 51-58. doi:10.1145/223904.223911
[3] S. K. Card, P. Pirolli, M. Van Der Wege, et al., “Information Scent as a Driver of Web Behavior Graphs,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2001, pp. 498-505.
[4] E. H. Chi, P. Pirolli, et al., “Using Information Scent to Model User Information Needs and Actions and the Web,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2001, pp. 490-497.
[5] A. Gonzalez, “Hot on the Scent of Information,” 2008. http://www.wired.com/science/discoveries/news/2001/06/44321
[6] P. Pirolli, “The Use of Proximal Information Scent to Forage for Distal Content on the World Wide Web,” Adaptive Perspective on Human-Technology Interaction: Methods & Models for Cognitive Engineering and Human-Computer Interaction, University Press, Oxford, pp. 247-266.
[7] P. Pirolli, “Information Foraging Theory,” Oxford University Press, New York, 2007, pp. 31-35. doi:10.1093/acprof:oso/9780195173321.001.0001
[8] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval,” Mechanical Industry Press, Beijing, 2005.
[9] P. Pirolli, “Information Foraging Theory,” Oxford University Press, New York, 2007, pp. 35-39. doi:10.1093/acprof:oso/9780195173321.001.0001
[10] S. K. Card, P. Pirolli, M. Van Der Wege, et al., “Information Scent as a Driver of Web Behavior Graphs: Results of a Protocol Analysis Method for Web Usability,” ACM Conference on Human Factors in Computing Systems, CHI Letters, Vol. 3, No. 1, 2001, pp. 498-505.
[11] S. Lambros, “Investigating the Applicability of Information Foraging Theory to Mobile Web Browsing,” Virginia Polytechnic Institute and State University School of Computer Science and Applications, Virginia, 2005.
[12] S. S. Sundar, S. Knobloch-Westerwick and M. R. Hastall, “News Cues: Information Scent and Cognitive Heuristics,” Journal of the American Society for Information Science and Technology, Vol. 58, No. 3, 2007, pp. 366-378.
[13] J. H. Barkow, L. Cosmides and J. Tooby, “The Adapted Mind: Evolutionary Psychology and the Generation of Culture,” Oxford University Press, Oxford, 1992.
[14] E. Brunswik, “Perception and the Representative Design of Psychological Experiments,” University of California Press, Berkeley, 1956.
[15] J. R. Anderson, “The Adaptive Nature of Human Categorization,” Psychological Review, Vol. 98, No. 3, 1991, pp. 409-429. doi:10.1037/0033-295X.98.3.409
[16] J. R. Anderson, “The Adaptive Character of Thought,” Routledge, Erlbaum, 1990.
[17] D. McFadden, “Modelling the Choice of Residential Location,” In: A. Karlqvist, F. Snickars and J. Weibull, Eds., Spatial Interaction Theory and Planning Models, North Holland, 1978, pp. 75-96.
[18] A. Woodruff, R. Rosenholtz, J. B. Morrison, A. Faulring and P. Pirolli, “A Comparison of the Use of Text Summaries, Plain Thumbnails, and Enhanced Thumbnails for Web Search Tasks,” Journal of the American Society for Information Science and Technology, Vol. 53, No. 2, 2002, pp. 172-185. doi:10.1002/asi.10029
[19] P. Pirolli and W.-T. Fu, “SNIF-ACT: A Model of Information Foraging on the World Wide Web,” Proceedings of the 9th International Conference on User Modeling, Vol. 2702, 2003, pp. 45-54.
[20] T. Hogg and B. A. Huberman, “Artificial Intelligence and Large Scale Computation: A Physics Perspective,” Physics Reports, Vol. 156, No. 5, 1987, pp. 227-310. doi:10.1016/0370-1573(87)90096-2
[21] D. W. Stephens and J. R. Krebs, “Foraging Theory,” Princeton University Press, Princeton, 1986.
[22] “Information Management Foundation,” Wuhan University Press, Wuhan, 2002.
[23] P. L. T. Pirolli, “Information Foraging Theory,” Oxford University Press, Oxford, 2007, pp. 30-46.
[24] C. S. Holling, “Some Characteristics of Simple Types of Predation and Parasitism,” The Canadian Entomologist, Vol. 91, No. 7, 1959, pp. 385-398. doi:10.4039/Ent91385-7
[25] D. W. Stephens and E. L. Charnov, “Optimal Foraging: Some Simple Stochastic Models,” Behavioral Ecology and Sociobiology, Vol. 10, No. 4, 1982, pp. 251-253. doi:10.1007/BF00302814
[26] P. L. T. Pirolli, “Information Foraging Theory,” Oxford University Press, Oxford, 2007, p. 33.
[27] P. Pirolli and S. Card, “Information Foraging in Information Access Environments,” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2005, pp. 51-58.
[28] P. L. T. Pirolli, “Information Foraging Theory,” Oxford University Press, Oxford, 2007, p. 35.
[29] S. K. Bhavnani, R. T. Jacob, J. Nardine and F. A. Peck, “Exploring the Distribution of Online Healthcare Information,” Computer Human Interaction “CHI’03” Extended Abstracts on Human Factors in Computing Systems, 2003, pp. 816-817.
[30] P. L. T. Pirolli, “Information Foraging Theory,” Oxford University Press, Oxford, 2007, p. 38.
