Paper Menu >>
Journal Menu >>
![]() J. Software Engineering & Applications, 2008, 1: 26-32 Published Online December 2008 in SciRes (www.SciRP.org/journal/jsea) Copyright © 2008 SciRes JSEA Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments Jong-Hyun Park 1 , Ji-Hoon Kang 2 1 Software Research Center, Chungnam National University, Gung-Dong, Yuseong-Gu, Daejeon, 305-764, South Korea, 2 Dept. of Computer Science and Engineering, Chungnam National University, Gung-Dong, Yuseong-Gu, Daejeon, 305-764, South Korea Email: { 1 jonghyunpark, 2 jhkang}@cnu.ac.kr Received October 30 th , 2008; revised November 10 th , 2008; accepted November 14 th , 2008. ABSTRACT Digital broadcasting is a novel paradigm for the next generation broadcasting. Its goal is to provide not only better quality of pictures but also a variety of services that is impossible in traditional airwaves broadcasting. One of the im- portant factors for this new broadcasting environment is the interoperability among broadcasting applications since the environment is distributed. Therefore the broadcasting metadata becomes increasingly important and one of the meta- data standards for a digital broadcasting is TV-Anytime metadata. TV-Anytime metadata is defined using XML schema, so its instances are XML data. In order to fulfill interoperability, a standard query language is also required and XQuery is a natural choice. There are some researches for dealing with broadcasting metadata. In our previous study, we have proposed the method for efficiently managing the broadcasting metadata in a service provider. However, the environment of a Set-Top Box for digital broadcasting is limited such as low-cost and low-setting. Therefore there are some considerations to apply general approaches for managing the metadata into the Set-Top Box. This paper pro- poses a method for efficiently managing the broadcasting metadata based on the Set-Top Box and a prototype of meta- data management system for evaluating our method. Our system consists of a storage engine to store the metadata and an XQuery engine to search the stored metadata and uses special index for storing and searching. Our two engines are designed independently with hardware platform therefore these engines can be used in any low-cost applications to manage broadcasting metadata. Keywords: Digital Broadcasting, Metadata Management, Storing and Searching XML Data, XQuery Processing, TV- Anytime metadata, Set-Top Box 1. Introduction Digital broadcasting is a novel paradigm for the next gen- eration broadcasting. Its goal is to provide not only better quality of pictures but also a variety of services that is impossible in traditional airwaves broadcasting [1]. One of the important factors for this new broadcasting envi- ronment is the interoperability among applications since the environment is distributed. As the digital broadcasting is evolving to more complex and diverse environment due to rapid increase of channels and content, the broadcasting metadata becomes increasingly important. Therefore a standard metadata for digital broadcasting is required and TV-Anytime metadata [2] that is proposed by the TV- Anytime Forum is one of the metadata standards for digi- tal broadcasting [3]. A Set-Top Box, which is called personal digital record- ers (PDR), is responsible for receiving and managing the digital content and its metadata. Currently, a Set-Top Box is designed with limited hardware and relatively software. Therefore, it is necessary to develop technologies for effi- ciently storing of metadata and searching stored metadata based on The Set-Top Box with low-costing and low- setting. Of course, several researches have already pro- posed some methods for managing metadata on digital broadcasting environment for these necessaries [4]. How- ever, we cannot confirm whether their methods run effi- ciently in a Set-Top box environment because they do not consider characteristics of a Set-Top Box. We have also proposed the method for efficiently managing the broad- casting metadata in a service provider before this study [4]. The result of our research was more effective than other methods. However, to apply our previous methods into Set-Top Box has several problems such as small storage, memory size, and limited software. Consequently, there are some issues to apply general approaches for managing the metadata into Set-Top Box and we have to consider these issues. In this paper, we propose a method for storing and searching broadcasting metadata. Also we implement the prototype using the proposed method and evaluate our method on a Set-Top Box environment with low-cost and low-setting. 2 He is a corresponding author ![]() Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments 27 Copyright © 2008 SciRes JSEA <TVAMain version="1"> <ProgramDescription> <ProgramInformationTable> <ProgramInformation programId="crid://www.kbs.com/KBSNews9103052300001"> <BasicDescription> <Title type="main">KBS News 9</Title> <Synopsis> Bank of Korea Cuts Key Rate, Kim Yu-na Captures Skate America Title </Synopsis> <Keyword> Main News </Keyword> <Keyword> Night News </Keyword> </BasicDescription> </ProgramInformation> </ProgramInformationTable> <ProgramLocationTable> <OnDemandProgram> <Program crid="crid://www.kbs.com/KBSNews9103052300001"></Program> <ProgramURL>D:\Media\News\news_9.mpg</ProgramURL> </OnDemandProgram> </ProgramLocationTable> <SegmentInformationTable timeUnit="PT1001N30000F"> <SegmentList> <SegmentInformation segmentId="SID_0_0_148"> . . . </SegmentInformation> . . . </SegmentList> </SegmentInformationTable> </ProgramDescription> </TVAMain> The remainder of this paper is organized as follows. Section 2 describes the related work. Section 3 and sec- tion 4 shows the index for Broadcasting metadata and a method for storing and searching metadata by our proto- type system, respectively. In section 5, we describe the conformance evaluation and finally, section 6 provides concluding remarks. 2. Related Work TV-Anytime forum is organized to develop specifications to enable services based on Local Storage and TV- Anytime Metadata is one of these specifications. TV- Anytime Metadata is used to describe various TV contents and is identified by CRID (Content Reference Identifier). The metadata allows consumer to find, navigate and man- age content from a variety of sources, for example, broad- cast, TV, internet. XML is the “representation format” used to define the schemas of the TV-Anytime Specifica- tion. Also, TV-Anytime metadata is technically defined using a single XML schema, so it is comprised of XML data. Figure 1 shows the structure of TV-Anytime meta- data and Figure 2 is its sample instance. TV-Anytime metadata is technically defined using sin- gle XML schema, and it's comprised of XML data. There- fore the method for storing and searching TV-Anytime metadata relates with the method for XML data. Many researchers have investigated different ways of storing XML data in relational databases [4,5,6,7], native XML databases [8,9], and file systems [10,11]. Some re searches including our previous research investigated methods for storing the broadcasting metadata into relational database and searching stored metadata [4,5]. [4,5] support both XPath and XQuery languages for searching. So, two systems Figure 1. The structure of TV-Anytime Metadata Figure 2. A sample instance of TV-Anytime Metadata have a module to convert from user query to SQL query and use a specialized indexing method for efficient searching (quick processing of selection, projection, and join). However these two systems use a commercial rela- tional database management system to manage the large volume of metadata because they only focused on service provider systems. Of course, it seems that it is a natural choice to use the RDBMS or Native XML DB because the content service provider has to mange not only the large volume of broadcasting metadata but also a lot of multi- media contents. However, their cost is expensive for STB with low-cost and low-setting. 3. Index for Broadcasting Metadata In order to store broadcasting metadata, we select the file system because of the cost and hardware power. Although we choose the file system, the basic idea for storing is similar to our previous approach for storing TV-anytime metadata into a relational database. In other words, the basic approach for storing is based on binary approach [12] and the approach for assigning an identifier into a node is the Dewey number labeling [13] to keep a parent- child relationship. Also we use the path table concept [14] for direct ac- cessing to every nodes and node position concept for ob- taining partial document from the metadata instance. Every node which has same name is stored in a single file and information for searching is addressed by the index file. Figure 3 shows the structure for indexing a ‘b’ node. ![]() 28 Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments Copyright © 2008 SciRes JSEA Figure 3. The structure of index The structure for indexing a node consists of node in- formation part, common ID part, and node values part. In the node information part, we store the name, ID, and po- sition which address the position of current node in origi- nal TV-Anytime metadata instance. The common ID part includes the name and ID of TV-Anytime metadata in- stance and Path ID which links with the XPath expression from root node to current node. The node value parts stores the information of child nodes and attribute nodes. Figure 4 shows an example XQuery query, Path Index, Node Index and document tree for obtaining result of the query briefly. In order to process the example XQuery query, a node has to satisfy following conditions. The full path expression to‘d’ node from root node is ‘a/b/c/d’, and its value have to contain “KBS News 9”. Also the parent node ‘c’ of the ’d’ node must have ‘Month’ attribute and its value have to equal to ‘May’. If a node satisfies these conditions, we can obtain the partial documents of TV- Anytime metadata instance including the node by the Node_Position. 4. Metadata Management System for Storing and Searching The goal of Metadata Management System is to store and search metadata efficiently in a Set-Top Box environment for digital broadcasting. Figure 5 shows the architecture and function of the metadata management system in the Set-Top Box. Our metadata management system consists of the Storage Engine and the XQuery Engine. As shown in Figure 6, Storage Engine provides basi- cally four interfaces: InsertDoc, DeleteDoc, UpdateDoc, and GetDoc for inserting, deleting, updating, and retriev- ing a metadata instance, respectively. In order to generate and store an index file including a metadata, InsertDoc parses the metadata received from Metadata Generator or Metadata Editor and then extracts and stores the informa- tion from the parsing Tree. DeleteDoc deletes the meta- data matched with the user-inputted CRID. UpdateDoc deletes the old metadata that has the same CRID as the new metadata, and then inserts the new metadata. Since XQuery doesn't support update of XML data, we use the delete and insert instead of update command. In this paper, we propose to use XQuery as query lan- guage for searching the broadcasting metadata. Since XQuery is standard query language proposed by W3C for querying XML data, it guarantees interoperability be- tween digital broadcasting applications including a Set- Top Box. An XQuery Engine consists of an XQuery parser module for query validation and a SearchDoc mod- ule for query execution. The input of XQuery Engine is the XQuery query, and its output is either the whole document or one part of the document. Figure 7 shows the architec- ture of XQuery Engine for a search of stored metadata. (1) Input XQuery query For $d in input (“TVAnytime”) For $p1 in $d/a/b/c For $p2 in $p1/d Where contains (string($p2), “KBS News9”) and $p1/ @ Month=“May” return <returns> {$p2} </returns> (2) Path Index (3) Each Node Index ![]() Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments 29 Copyright © 2008 SciRes JSEA (4) Document Tree (5) Result Composer <returns>KBS News 9</returns> Figure 4. An example for processing a XQuery query Figure 5. The architecture of TV-anytime metadata man- agement system Figure 6. The architecture of storage engine XQuery Analyzer gets a query in XQuery, parses the query using an XQuery parser and generates its syntax tree. XPath Translator module creates an XPath expres- Figure 7. The architecture of XQuery engine sion which consist of full path expression to current node from root node by merging XPath expressions defined in FOR and LET clauses in XQuery queries. WHERE Proc- essor and RETURN Processor are used for processing conditions defined in a WHERE clause and for construct- ing the result structure defined in RETURN clause, re- spectively. Index Analyzer parses the index files and gen- erates the information for obtaining result metadata frag- ments from the storage by using the selected index. Result Composer constructs the final result using the result struc- ture and result metadata fragments. 5. Performance Evaluation In order to evaluate whether our choice of the strategies for the issues is relevant, we compare our prototype sys- tem with other general-purpose XQuery Engine and test their performance for various typical queries. We select two popular general XQuery Engines. One is the Oracle XQuery Engine [10]. The other is a Saxon-B XQuery En- gine [11]. Two XQuery Engine it is all free source, a JAVA base, and a head of a family general XQuery En- gine. The experimental setup is as follow: the CPU is Intel Pentium III Process 750 MHz, the memory size is 256 MB, the JDK version is 1.4 and the OS is LINUX 2.4.2. Our system uses XQuery, which is a sub set of XQuery 1.0 (e.g. is not support ‘OR’ in WHERE clause and ‘//’ in XPath path express). From the previous work [4, 15, 16, 17], we have found that the query processing performance depend on the XPath expression, number of predicates, and result size. By considering these factors, I use the XQuery in Table 1. We omit some expressions in example queries except Q1. For example, the constructor ‘<Results>’ is omitted because that is the same as in Q1. The queries Q1, Q2 and Q3 use single condition which is declared in the WHERE TV-Anytime Metadata Metadata Generator Metadata Editor InsertDoc DeleteDoc Storage Engine UpdateDoc GetDoc I n dex ![]() 30 Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments Copyright © 2008 SciRes JSEA Table 1. Example XQuery queries for experiment WHERE Conditions / RETURN Value XQuery query Single condition/ Single terminal node Q1 <Results>{ for $d in input("TVAnytime") return <Result>{ for $p1 in $d/TVAMain/ProgramDescription/ ProgramInformationTable/ProgramInformation for $p2 in $p1/@programId for $p3 in $p1/BasicDescription/Title where $p2="crid://www.kids17.net/amigonme 103042200049" return <node>{ $p3 }</node> }</Result> }</Results> Single condition/ Single root node Q2 for $p1 in $d/TVAMain for $p2 in $p1/ProgramDescription/ProgramInformation Table/ProgramInformation/@programId where $p2="crid://www.kids17.net/amigonme 103042200049" return <node>{ $p1 }</node> Single condition/ Multiple terminal nodes Q3 for $p1 in $d/TVAMain for $p2 in $p1/ProgramDescription/ProgramInformationTable/ ProgramInformation/BasicDescription/Genre/Name where $p2="Education" return <node>{ $p1 }</node> Three conditions/ Single root node Q4 for $p1 in $d/TVAMain for $p2 in $p1/ProgramDescription/ProgramInformation Table/ProgramInformation/BasicDescription for $p3 in $p2/Language for $p4 in $p2/ProductionDate/TimePoint for $p5 in $p2/ReleaseInformation/ReleaseDate/ DayAndYear where $p3="ko" and $p4>="2006" and $p5="2006-04-14" return <node>{ $p1 }</node> Five conditions/ Multiple root nodes Q5 for $p1 in $d/TVAMain for $p2 in $p1/ProgramDescription/ProgramInformation Table/ProgramInformation/BasicDescription for $p3 in $p2/Language for $p4 in $p2/ProductionDate/TimePoint for $p5 in $p2/ReleaseInformation/ReleaseDate/ DayAndYear for $p6 in $p1/ProgramDescription/ProgramLocation Table/BroadcastEvent/Live/@value for $p7 in $p1/ProgramDescription/ServiceInformation Table/ServiceInformation/Name where $p3="ko" and $p4>="2006" and $p5="2006-04-14" and $p6="true" and $p7="KBS" return <node>{ $p1 }</node> Three conditions/ Multiple terminal & root nodes Q6 for $p1 in $d/TVAMain for $p2 in $p1/ProgramDescription/ProgramInformation Table/ProgramInformation/BasicDescription for $p3 in $p2/Title for $p4 in $p2/Language for $p5 in $p2/ProductionDate/TimePoint for $p6 in $p2/ReleaseInformation/ReleaseDate/ DayAndYear for $p7 in $p1/ProgramDescription/ProgramLocation Table/BroadcastEvent/Live/@value for $p8 in $p1/ProgramDescription/ServiceInformation Table/ServiceInformation/Owner for $p9 in $p1/ProgramDescription/CreditsInformation Table/PersonName for $p10 in $p1/ProgramDescription/ServiceInformation Table/ServiceInformation/ParentService where $p3="KBS News 9" and $p4="ko" and $p5>="2006" and $p7="true" and $p8="KBS" return <node>{ $p3, $p4, $p5, $p6, $p9, $p10 }</node> Figure 8. Comparison of query processing times clause. However, the result data sizes are expected differ- ent because the result of each query is a leaf node, an root node, and multiple root nodes together with their descen- dent nodes, respectively. Q4, Q5 and Q6 use different number of conditions. The return value of each query is a single root node, multiple root nodes, and multiple termi- nal and root nodes, respectively. Figure 8 summarizes the performance. The numbers of the test data are 50 and 200 TV-Anytime metadata in- stances respectively. The result shows that our system outperforms other methods for any queries except Q6. In case of Saxon B and Oracle, the complex queries Q4 and Q5, takes more execution time than simple query Q1, Q2, and Q3. However, our system does not so depend on the queries. In case of our system, Q6 takes more execution time than the other queries since we need time to compose result. However the case of Q6 is not general, because the result size of user queries is not large volume in a Set-Top Box, generally. Figure 9 summarizes the scalability property of the sys- tems. The size of the test data is 50 documents, 100 documents, 150 documents and 200 documents, respec- tively. In case of Saxon B and Oracle, the processing time increases linearly as the number of data increases. How- ever, the processing time of our system is independent of the data size for searching. The result of the evaluation shows that our system outperforms so that our approach is believed to be on of the efficient approaches for managing metadata in the Set-Top Box. 50 documents Saxon B Oracle Our System 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) Q1 Q2 Q3 Q4 Q5 Q6 Queries Q1 Q2 Q3 Q4 Q5 Q6 Queries 200 documents 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) Saxon B Oracle Our System ![]() Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments 31 Copyright © 2008 SciRes JSEA Figure 9. Performance evaluation for scalability property 6. Conclusions In this paper, we have proposed a method for storing and searching TV-Anytime metadata for digital broadcasting based on a Set-Top Box which is low-cost and low-setting. Also we have implemented a prototype system for apply- ing our method and evaluated our approach which seems important since our prototype system outperforms the other compared systems. Our system was developed on digital broadcast environments [18]. However our result can be applied to any XML management systems that fo- cus on the performance of store and retrieval on low-cost environments. 7. Acknowledgement This research is supported by MKE & IITA(08- Infrastructure-13, Ubiquitous Technology Research Cen- ter), and also by Foundation of ubiquitous computing and networking project (UCN) Project, the Ministry of Knowledge Economy (MKE) 21st Century Frontier R&D Program in Korea and a result of subproject UCN 08B3-O1- 30S. 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) Q 3 Q 4 Saxon B Oracle Our System Saxon B Oracle Our System 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) 50 100 150 200 Number of documents 50 100 150 200 Number of documents 50 100 150 200 Number of documents 50 100 150 200 Number of documents Saxon B Oracle Our System Saxon B Oracle Our System Q 5 Q 6 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) Saxon B Oracle Our System 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) 50 100 150 200 Number of documents Q1 Q2 Saxon B Oracle Our System 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 Time (milliseconds) 50 100 150 200 Number of documents ![]() 32 Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments Copyright © 2008 SciRes JSEA REFERENCES [1] S. Pfeiffer and U. Srinivasan, “TV anytime as an applica- tion scenario for MPEG-7,” In Proceedings ACM Multi- media 2000, Los Angeles, October 2000. [2] “TV-anytime phase 1,” Part 3 Metadata, ETSI TS 102 822-3-1, Vol. 1.1.1, October 2003. [3] TV-Anytime Forum Website: http://www.tv-anytime.org. [4] J. H. Park, J. H. Kang, B. K. Kim, Y. H. Lee, M. W. Lee and M. O. Jung, “An XQuery-based TV-anytime metadata management,” Proceedings of DASFAA’05 Conference, April 2005. [5] H. S. Shin, “A storage and retrieval method of XML- based metadata in PVR environment,” IEEE Transactions on Consumer Electronics, Vol. 49, No. 4, pp. 1136-1140, November 2003. [6] D. Florescu and D. Kossmann, “Storing and querying xml data using an RDBMS,” IEEE Data Engineering Bulletin, Vol. 22, No. 3, 1999. [7] I. Tatatinov, S. D. Viglas, K. Beyer, J. Shanmugasunda- ram, E. Shekita, and C. Zhang, “Storing and Querying or- dered XML using a relational database system”, Proceed- ings of ACM SIGMOD Conference, June 2002. [8] ORACLE, “Berkeley DB introduction,” http://www.oracle. com/database/berkeley-db/. [9] T. Fiebig, S. Helmer, C. C. Kanne, J. Mildenberger, G. Moerkotte, R. Schiele, and T. Westmann, “Anatomy of a Native XML Base Management System,” Technical Re- port 01, University of Mannheim, 2002. [10] ORACLE, “Oracle XML data synthesis or XDS,” http://www.oracle.com/technology/tech/xml/xds/. [11] SAXONICA, “SAXON XQuery Engine,” http://www. saxonica.com/. [12] D. Florescu and D. Kossmann, “Storing and querying XML data using an RDBMS,” IEEE Data Engineering Bulletin, Vol. 22, No. 3, pp. 27-34, September 1999. [13] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasunda- ram, E. Shekita, and C. Zhang, “Storing and querying ordered xml using a RDB system,” Proceedings ACM SIGMOD Con- ference, June 2002. [14] M. Yoshikawa, T. Amagasa, T. Shimura, and S. Uemura: “XRel: a path-based approach to storage and retrieval of XML documents using RDBs,” Proceedings ACM Trans- actions on Internet Technology, Vol. 5, August 2001. [15] T. Grust, “Accelerating XPath location steps,” Proceedings of the ACM SIGMOD Confefence, pp.109-120, June 2006. [16] M. Barg and R. K. Wong, “A fast and versatile path index for querying semi-structured data,” Proceedings of the DASSFAA’03 Conference, pp. 249-256, March 2003. [17] S. Hidaka, H. Kato and M. Yoshikawa, “A relative cost model for XQuery,” Proceedings of the SAC’07 Confer- ence, March 2007. [18] K. Kang, J. G. Kim, H. K. Lee, H. S. Chang, S. J. Yang, Y. T. Kim, H. K. Lee, and J. W. Kim, “Metadata broadcasting for personalized service: A practical solution,” ETRI Jour- nal, Vol. 26, No. 5, pp. 452-466, October 2004. |