Storing and Searching Metadata for Digital Broadcasting on Set-Top Box Environments

doi:10.4236/jsea.2008.11005


J. Software Engineering & Applications, 2008, 1: 26-32

Published Online December 2008 in SciRes (www.SciRP.org/journal/jsea)


Jong-Hyun Park, Ji-Hoon Kang

Software Research Center, Chungnam National University, Gung-Dong, Yuseong-Gu, Daejeon, 305-764, South Korea; Dept. of Computer Science and Engineering, Chungnam National University, Gung-Dong, Yuseong-Gu, Daejeon, 305-764, South Korea

Email: {jonghyunpark, jhkang}@cnu.ac.kr

Received October 30, 2008; revised November 10, 2008; accepted November 14, 2008.

ABSTRACT

Digital broadcasting is a new paradigm for next-generation broadcasting. Its goal is to provide not only better picture quality but also a variety of services that are impossible in traditional airwave broadcasting. Because the environment is distributed, interoperability among broadcasting applications is one of its important factors. Broadcasting metadata is therefore increasingly important, and TV-Anytime metadata is one of the metadata standards for digital broadcasting. TV-Anytime metadata is defined using XML Schema, so its instances are XML data. Interoperability also requires a standard query language, and XQuery is a natural choice. Several studies have dealt with broadcasting metadata. In our previous study, we proposed a method for efficiently managing broadcasting metadata at a service provider. However, a Set-Top Box for digital broadcasting is a constrained, low-cost, low-specification environment, so general approaches to metadata management cannot be applied to it without further consideration. This paper proposes a method for efficiently managing broadcasting metadata on a Set-Top Box and presents a prototype metadata management system for evaluating the method. The system consists of a storage engine that stores the metadata and an XQuery engine that searches the stored metadata, and it uses a special index for storing and searching. Both engines are designed independently of the hardware platform, so they can be used in any low-cost application that manages broadcasting metadata.

Keywords: Digital Broadcasting, Metadata Management, Storing and Searching XML Data, XQuery Processing, TV-Anytime Metadata, Set-Top Box

1. Introduction

Digital broadcasting is a new paradigm for next-generation broadcasting. Its goal is to provide not only better picture quality but also a variety of services that are impossible in traditional airwave broadcasting [1]. Because the environment is distributed, interoperability among applications is one of its important factors. As digital broadcasting evolves into a more complex and diverse environment, driven by a rapid increase in channels and content, broadcasting metadata becomes increasingly important. A standard metadata format for digital broadcasting is therefore required, and TV-Anytime metadata [2], proposed by the TV-Anytime Forum, is one such standard [3].

A Set-Top Box, also called a personal digital recorder (PDR), is responsible for receiving and managing digital content and its metadata. Currently, a Set-Top Box is designed with limited hardware and relatively limited software. It is therefore necessary to develop technologies for efficiently storing metadata and searching the stored metadata on a low-cost, low-specification Set-Top Box. Several studies have already proposed methods for managing metadata in digital broadcasting environments [4]. However, we cannot confirm whether those methods run efficiently on a Set-Top Box, because they do not consider its characteristics. Before this study, we also proposed a method for efficiently managing broadcasting metadata at a service provider [4], and our results were more effective than those of other methods. However, applying our previous method to a Set-Top Box raises several problems, such as small storage, small memory, and limited software. Consequently, general approaches to metadata management cannot be applied to a Set-Top Box without addressing these issues.

In this paper, we propose a method for storing and searching broadcasting metadata. We also implement a prototype using the proposed method and evaluate it in a low-cost, low-specification Set-Top Box environment.

He is a corresponding author


<ProgramInformation programId="crid://www.kbs.com/KBSNews9103052300001">
<Synopsis> Bank of Korea Cuts Key Rate, Kim Yu-na Captures Skate America Title </Synopsis>
<Keyword> Night News </Keyword>
</BasicDescription>
</ProgramInformation>
</ProgramInformationTable>
<Program crid="crid://www.kbs.com/KBSNews9103052300001"></Program>
<ProgramURL>D:\Media\News\news_9.mpg</ProgramURL>
</OnDemandProgram>
</ProgramLocationTable>
<SegmentInformation segmentId="SID_0_0_148"> . . . </SegmentInformation> . . .
</SegmentList>
</SegmentInformationTable>
</ProgramDescription>
</TVAMain>

The remainder of this paper is organized as follows. Section 2 describes related work. Sections 3 and 4 present the index for broadcasting metadata and the method our prototype system uses for storing and searching metadata, respectively. Section 5 describes the performance evaluation, and Section 6 provides concluding remarks.

2. Related Work

The TV-Anytime Forum was organized to develop specifications that enable services based on local storage, and TV-Anytime metadata is one of these specifications. TV-Anytime metadata describes various TV content, which is identified by a CRID (Content Reference Identifier). The metadata allows consumers to find, navigate, and manage content from a variety of sources, for example broadcast, TV, and the Internet. XML is the “representation format” used to define the schemas of the TV-Anytime specification. TV-Anytime metadata is technically defined using a single XML schema, so its instances are XML data. Figure 1 shows the structure of TV-Anytime metadata, and Figure 2 shows a sample instance.

Because TV-Anytime metadata is XML data, methods for storing and searching it are closely related to methods for XML data in general. Many researchers have investigated ways of storing XML data in relational databases [4,5,6,7], native XML databases [8,9], and file systems [10,11]. Some studies, including our previous work, investigated methods for storing broadcasting metadata in a relational database and searching the stored metadata [4,5]. Both systems [4,5] support the XPath and XQuery languages for searching, so each has a module that converts a user query into an SQL query and uses a specialized indexing method for efficient searching (quick processing of selection, projection, and join). However, both systems rely on a commercial relational database management system to manage a large volume of metadata, because they focus only on service provider systems. An RDBMS or a native XML database is a natural choice for a content service provider, which has to manage not only a large volume of broadcasting metadata but also a large amount of multimedia content. For a low-cost, low-specification STB, however, these systems are too expensive.

Figure 1. The structure of TV-Anytime Metadata

Figure 2. A sample instance of TV-Anytime Metadata

3. Index for Broadcasting Metadata

We store broadcasting metadata in the file system because of the cost and hardware limits of the Set-Top Box. Although we choose the file system, the basic idea is similar to our previous approach for storing TV-Anytime metadata in a relational database. That is, storage is based on the binary approach [12], and node identifiers are assigned by Dewey number labeling [13] so that parent-child relationships are preserved.
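As a rough illustration of Dewey number labeling (a sketch, not the authors' implementation), the code below gives each element a label made of its parent's label plus its 1-based position among its siblings, so that a parent-child test reduces to a prefix comparison. The function names `dewey_labels` and `is_parent` and the small sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def dewey_labels(root):
    """Return (label, tag) pairs in document order; a label is a tuple such as (1, 1, 2)."""
    out = []

    def walk(node, label):
        out.append((label, node.tag))
        # Each child extends the parent's label with its 1-based sibling position.
        for i, child in enumerate(node, start=1):
            walk(child, label + (i,))

    walk(root, (1,))
    return out

def is_parent(parent, child):
    """True if `parent` is the child's label with the last step removed."""
    return len(child) == len(parent) + 1 and child[:len(parent)] == parent

# Hypothetical document shaped like the a/b/c/d example used in this section.
doc = ET.fromstring("<a><b><c Month='May'><d>KBS News 9</d></c></b><b/></a>")
labels = dewey_labels(doc)
for label, tag in labels:
    print(".".join(map(str, label)), tag)
```

Because the label of a node contains the labels of all its ancestors, no separate parent pointer has to be stored alongside each node.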

We also use the path table concept [14] for direct access to every node, and a node position concept for obtaining a partial document from a metadata instance. All nodes with the same name are stored in a single file, and the information needed for searching is addressed through the index file. Figure 3 shows the structure for indexing a ‘b’ node.


Figure 3. The structure of index

The structure for indexing a node consists of a node information part, a common ID part, and a node values part. The node information part stores the name, the ID, and the position, which addresses the location of the current node in the original TV-Anytime metadata instance. The common ID part includes the name and ID of the TV-Anytime metadata instance and a Path ID, which links to the XPath expression from the root node to the current node. The node values part stores the information of child nodes and attribute nodes.
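The three parts described above can be pictured as a small record type. The following is a minimal sketch under stated assumptions: the class and field names are invented for illustration, and the position is modeled as a pair of character offsets, which the paper does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class NodeInfo:
    name: str
    node_id: str               # Dewey-style identifier, e.g. "1.1.1"
    position: Tuple[int, int]  # assumed: (start, end) character offsets in the instance

@dataclass
class CommonId:
    doc_name: str              # name of the metadata instance
    doc_id: int                # ID of the metadata instance
    path_id: int               # key of the root-to-node path in the path index

@dataclass
class NodeValues:
    children: List[str] = field(default_factory=list)         # IDs of child nodes
    attributes: Dict[str, str] = field(default_factory=dict)  # attribute name -> value

@dataclass
class IndexEntry:
    info: NodeInfo
    common: CommonId
    values: NodeValues

def fragment(entry: IndexEntry, document_text: str) -> str:
    """Cut the partial document for a node out of the original instance."""
    start, end = entry.info.position
    return document_text[start:end]

# Hypothetical instance and one entry for its 'c' node.
source = '<a><b><c Month="May"><d>KBS News 9</d></c></b></a>'
start = source.index("<c")
end = source.index("</c>") + len("</c>")
entry = IndexEntry(
    NodeInfo("c", "1.1.1", (start, end)),
    CommonId("news.xml", 1, 3),
    NodeValues(children=["1.1.1.1"], attributes={"Month": "May"}),
)
print(fragment(entry, source))
```

Keeping the position in the entry means a partial document can be returned without re-parsing the whole instance.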

Figure 4 briefly shows an example XQuery query together with the Path Index, the Node Index, and the document tree used to obtain its result. To satisfy the example query, a node must meet the following conditions. The full path expression from the root node to the ‘d’ node is ‘a/b/c/d’, and the value of the node has to contain “KBS News 9”. In addition, the parent node ‘c’ of the ‘d’ node must have a ‘Month’ attribute whose value equals ‘May’. If a node satisfies these conditions, we can obtain the partial document of the TV-Anytime metadata instance that includes the node by using its Node_Position.
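The conditions of the example query can be checked against a path index and per-name node indexes roughly as follows. This is a simplified in-memory sketch, not the paper's file layout: the dictionaries `path_index` and `node_index`, the record fields, and the function `search` are all assumptions for illustration.

```python
# Path index: full path expression from the root -> path ID.
path_index = {"a": 1, "a/b": 2, "a/b/c": 3, "a/b/c/d": 4}

# One list per node name; each record holds the document ID, path ID,
# Dewey ID, text value, and attributes of one node (hypothetical data).
node_index = {
    "c": [
        {"doc": 1, "path": 3, "id": (1, 1, 1), "value": "", "attrs": {"Month": "May"}},
        {"doc": 1, "path": 3, "id": (1, 1, 2), "value": "", "attrs": {"Month": "June"}},
    ],
    "d": [
        {"doc": 1, "path": 4, "id": (1, 1, 1, 1), "value": "KBS News 9", "attrs": {}},
        {"doc": 1, "path": 4, "id": (1, 1, 2, 1), "value": "KBS News 9", "attrs": {}},
        {"doc": 1, "path": 4, "id": (1, 1, 1, 2), "value": "Weather", "attrs": {}},
    ],
}

def search(path, text, parent_attr, parent_value):
    """Find nodes at `path` whose value contains `text` and whose parent has the attribute."""
    steps = path.split("/")
    leaf_name, parent_name = steps[-1], steps[-2]
    target = path_index[path]
    parent_path = path_index["/".join(steps[:-1])]
    # Parents that satisfy the attribute condition, keyed by (document, Dewey ID).
    parents = {
        (r["doc"], r["id"])
        for r in node_index[parent_name]
        if r["path"] == parent_path and r["attrs"].get(parent_attr) == parent_value
    }
    # A node's parent ID is its own Dewey ID with the last step removed.
    return [
        r["id"]
        for r in node_index[leaf_name]
        if r["path"] == target and text in r["value"]
        and (r["doc"], r["id"][:-1]) in parents
    ]

hits = search("a/b/c/d", "KBS News 9", "Month", "May")
print(hits)
```

Only the records for the two node names involved are read, which is the point of storing nodes with the same name together.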

4. Metadata Management System for Storing and Searching

The goal of the Metadata Management System is to store and search metadata efficiently in a Set-Top Box environment for digital broadcasting. Figure 5 shows the architecture and functions of the metadata management system in the Set-Top Box. The system consists of the Storage Engine and the XQuery Engine.

As shown in Figure 6, the Storage Engine provides four basic interfaces: InsertDoc, DeleteDoc, UpdateDoc, and GetDoc, for inserting, deleting, updating, and retrieving a metadata instance, respectively. To generate and store the index files for a metadata instance, InsertDoc parses the metadata received from the Metadata Generator or the Metadata Editor and then extracts and stores the information from the parse tree. DeleteDoc deletes the metadata that matches the CRID supplied by the user. UpdateDoc deletes the old metadata that has the same CRID as the new metadata and then inserts the new metadata. Since XQuery 1.0 does not support updating XML data, we use delete followed by insert instead of an update command.
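The four interfaces can be sketched as a small class keyed by CRID. This is a hedged, in-memory stand-in: the class `StorageEngine`, its method names, and the dictionary store are assumptions, and the parsing and index-file generation that InsertDoc performs in the paper are deliberately left out.

```python
class StorageEngine:
    """In-memory stand-in for the four storage interfaces (illustrative only)."""

    def __init__(self):
        self._docs = {}  # CRID -> metadata instance as XML text

    def insert_doc(self, crid, xml_text):
        # The real InsertDoc would also parse the instance and write index files.
        if crid in self._docs:
            raise KeyError(f"CRID already stored: {crid}")
        self._docs[crid] = xml_text

    def delete_doc(self, crid):
        # Returns True if a stored instance was removed.
        return self._docs.pop(crid, None) is not None

    def update_doc(self, crid, xml_text):
        # Update is expressed as delete followed by insert.
        self.delete_doc(crid)
        self.insert_doc(crid, xml_text)

    def get_doc(self, crid):
        return self._docs.get(crid)

engine = StorageEngine()
crid = "crid://www.kbs.com/KBSNews9103052300001"
engine.insert_doc(crid, "<TVAMain>old</TVAMain>")
engine.update_doc(crid, "<TVAMain>new</TVAMain>")
print(engine.get_doc(crid))
```

Expressing the update as delete plus insert keeps the index logic in one place, since only insertion has to know how index entries are built.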

In this paper, we propose XQuery as the query language for searching broadcasting metadata. Since XQuery is the standard query language proposed by the W3C for querying XML data, it guarantees interoperability among digital broadcasting applications, including a Set-Top Box. The XQuery Engine consists of an XQuery parser module for query validation and a SearchDoc module for query execution. Its input is an XQuery query, and its output is either a whole document or a part of a document. Figure 7 shows the architecture of the XQuery Engine for searching stored metadata.

(1) Input XQuery query

    for $d in input("TVAnytime")
    for $p1 in $d/a/b/c
    for $p2 in $p1/d
    where contains(string($p2), "KBS News9") and $p1/@Month = "May"
    return <returns> {$p2} </returns>

(2) Path Index

(3) Each Node Index

(4) Document Tree

(5) Result Composer

Figure 4. An example for processing an XQuery query

Figure 5. The architecture of the TV-Anytime metadata management system

Figure 6. The architecture of the storage engine

Figure 7. The architecture of XQuery engine

The XQuery Analyzer takes a query written in XQuery, parses it with an XQuery parser, and generates its syntax tree. The XPath Translator module creates XPath expressions, each consisting of the full path from the root node to the current node, by merging the XPath expressions defined in the FOR and LET clauses of the query. The WHERE Processor and the RETURN Processor process the conditions defined in the WHERE clause and construct the result structure defined in the RETURN clause, respectively. The Index Analyzer parses the index files and, using the selected index, generates the information needed to obtain the result metadata fragments from storage. The Result Composer constructs the final result from the result structure and the result metadata fragments.
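The merging step performed by the XPath Translator can be illustrated with a few lines of code. This is a simplified sketch, assuming each FOR or LET binding is either a document source or a path that starts with a previously bound variable; the function name `expand_paths` and the input format are hypothetical.

```python
def expand_paths(bindings):
    """Map each variable to its full path from the root.

    `bindings` is a list of (variable, expression) pairs in clause order.
    """
    full = {}
    for var, expr in bindings:
        head, _, rest = expr.partition("/")
        if head.startswith("$"):
            # Path relative to an earlier variable: prepend that variable's full path.
            base = full[head]
            full[var] = f"{base}/{rest}" if base and rest else (base or rest)
        else:
            # Document source such as input("TVAnytime"): the path starts at the root.
            full[var] = ""
    return full

# Bindings taken from the FOR clauses of the example query in Figure 4.
paths = expand_paths([
    ("$d", 'input("TVAnytime")'),
    ("$p1", "$d/a/b/c"),
    ("$p2", "$p1/d"),
])
print(paths["$p2"])
```

Each resulting full path can then serve as a lookup key in a path index, so conditions on `$p2` can be evaluated without walking the document tree.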

5. Performance Evaluation

To evaluate whether the strategies we chose are appropriate, we compare our prototype system with general-purpose XQuery engines and test their performance on various typical queries. We select two popular general-purpose XQuery engines: the Oracle XQuery Engine [10] and the Saxon-B XQuery Engine [11]. Both are freely available, Java-based, and representative of general-purpose XQuery engines. The experimental setup is as follows: the CPU is an Intel Pentium III processor at 750 MHz, the memory size is 256 MB, the JDK version is 1.4, and the OS is Linux 2.4.2. Our system supports a subset of XQuery 1.0 (for example, it does not support ‘or’ in the WHERE clause or ‘//’ in XPath path expressions). From previous work [4, 15, 16, 17], we found that query processing performance depends on the XPath expression, the number of predicates, and the result size. Considering these factors, we use the XQuery queries in Table 1.

Table 1. Example XQuery queries for experiment

Each entry gives the WHERE conditions / RETURN value, followed by the XQuery query.

Q1. Single condition / Single terminal node

    <Results>{
      for $d in input("TVAnytime")
      return <Result>{
        for $p1 in $d/TVAMain/ProgramDescription/ProgramInformationTable/ProgramInformation
        for $p2 in $p1/@programId
        for $p3 in $p1/BasicDescription/Title
        where $p2="crid://www.kids17.net/amigonme103042200049"
        return <node>{ $p3 }</node>
      }</Result> }</Results>

Q2. Single condition / Single root node

    for $p1 in $d/TVAMain
    for $p2 in $p1/ProgramDescription/ProgramInformationTable/ProgramInformation/@programId
    where $p2="crid://www.kids17.net/amigonme103042200049"
    return <node>{ $p1 }</node>

Q3. Single condition / Multiple terminal nodes

    for $p1 in $d/TVAMain
    for $p2 in $p1/ProgramDescription/ProgramInformationTable/ProgramInformation/BasicDescription/Genre/Name
    where $p2="Education"
    return <node>{ $p1 }</node>

Q4. Three conditions / Single root node

    for $p1 in $d/TVAMain
    for $p2 in $p1/ProgramDescription/ProgramInformationTable/ProgramInformation/BasicDescription
    for $p3 in $p2/Language
    for $p4 in $p2/ProductionDate/TimePoint
    for $p5 in $p2/ReleaseInformation/ReleaseDate/DayAndYear
    where $p3="ko" and $p4>="2006" and $p5="2006-04-14"
    return <node>{ $p1 }</node>

Q5. Five conditions / Multiple root nodes

    for $p1 in $d/TVAMain
    for $p2 in $p1/ProgramDescription/ProgramInformationTable/ProgramInformation/BasicDescription
    for $p3 in $p2/Language
    for $p4 in $p2/ProductionDate/TimePoint
    for $p5 in $p2/ReleaseInformation/ReleaseDate/DayAndYear
    for $p6 in $p1/ProgramDescription/ProgramLocationTable/BroadcastEvent/Live/@value
    for $p7 in $p1/ProgramDescription/ServiceInformationTable/ServiceInformation/Name
    where $p3="ko" and $p4>="2006" and $p5="2006-04-14" and $p6="true" and $p7="KBS"
    return <node>{ $p1 }</node>

Q6. Three conditions / Multiple terminal & root nodes

    for $p1 in $d/TVAMain
    for $p2 in $p1/ProgramDescription/ProgramInformationTable/ProgramInformation/BasicDescription
    for $p3 in $p2/Title
    for $p4 in $p2/Language
    for $p5 in $p2/ProductionDate/TimePoint
    for $p6 in $p2/ReleaseInformation/ReleaseDate/DayAndYear
    for $p7 in $p1/ProgramDescription/ProgramLocationTable/BroadcastEvent/Live/@value
    for $p8 in $p1/ProgramDescription/ServiceInformationTable/ServiceInformation/Owner
    for $p9 in $p1/ProgramDescription/CreditsInformationTable/PersonName
    for $p10 in $p1/ProgramDescription/ServiceInformationTable/ServiceInformation/ParentService
    where $p3="KBS News 9" and $p4="ko" and $p5>="2006" and $p7="true" and $p8="KBS"
    return <node>{ $p3, $p4, $p5, $p6, $p9, $p10 }</node>

We omit some expressions in the example queries other than Q1. For example, the constructor ‘<Results>’ is omitted because it is the same as in Q1. Queries Q1, Q2, and Q3 each use a single condition declared in the WHERE clause. However, their result sizes are expected to differ, because the results are a leaf node, a root node, and multiple root nodes together with their descendant nodes, respectively. Q4, Q5, and Q6 use different numbers of conditions. Their return values are a single root node, multiple root nodes, and multiple terminal and root nodes, respectively.

Figure 8. Comparison of query processing times

Figure 8 summarizes the performance. The test data consist of 50 and 200 TV-Anytime metadata instances, respectively. The results show that our system outperforms the other systems for every query except Q6. With Saxon-B and Oracle, the complex queries Q4 and Q5 take more execution time than the simple queries Q1, Q2, and Q3, whereas the execution time of our system depends much less on the query. In our system, Q6 takes more execution time than the other queries because time is needed to compose the result. However, Q6 is not a typical case, because the result of a user query on a Set-Top Box is generally not large.

Figure 9 summarizes the scalability of the systems. The test data sets contain 50, 100, 150, and 200 documents, respectively. With Saxon-B and Oracle, the processing time increases linearly as the number of documents increases. In contrast, the processing time of our system is independent of the size of the data being searched. The evaluation results show that our system outperforms the compared systems, so we believe our approach is one of the efficient approaches for managing metadata in the Set-Top Box.

[Figure 8 plot: two panels, "50 documents" and "200 documents"; x-axis: Queries; y-axis: Time (milliseconds), 2,000 to 14,000; series: Saxon B, Oracle, Our System]

Figure 9. Performance evaluation for scalability property

6. Conclusions

In this paper, we have proposed a method for storing and searching TV-Anytime metadata for digital broadcasting on a low-cost, low-specification Set-Top Box. We have also implemented a prototype system that applies our method and evaluated our approach, which appears promising since the prototype outperforms the other compared systems. Our system was developed for digital broadcast environments [18]. However, our results can be applied to any XML management system that focuses on storage and retrieval performance in low-cost environments.

7. Acknowledgement

This research is supported by MKE & IITA (08-Infrastructure-13, Ubiquitous Technology Research Center), and also by the Foundation of Ubiquitous Computing and Networking (UCN) Project, the Ministry of Knowledge Economy (MKE) 21st Century Frontier R&D Program in Korea, as a result of subproject UCN 08B3-O1-30S.

[Figure 9 plots: x-axis: Number of documents; y-axis: Time (milliseconds), 2,000 to 14,000; series: Saxon B, Oracle, Our System]

REFERENCES

[1] S. Pfeiffer and U. Srinivasan, “TV anytime as an application scenario for MPEG-7,” Proceedings of ACM Multimedia 2000, Los Angeles, October 2000.

[2] “TV-anytime phase 1,” Part 3 Metadata, ETSI TS 102 822-3-1, Vol. 1.1.1, October 2003.

[3] TV-Anytime Forum Website: http://www.tv-anytime.org.

[4] J. H. Park, J. H. Kang, B. K. Kim, Y. H. Lee, M. W. Lee, and M. O. Jung, “An XQuery-based TV-anytime metadata management,” Proceedings of the DASFAA’05 Conference, April 2005.

[5] H. S. Shin, “A storage and retrieval method of XML-based metadata in PVR environment,” IEEE Transactions on Consumer Electronics, Vol. 49, No. 4, pp. 1136-1140, November 2003.

[6] D. Florescu and D. Kossmann, “Storing and querying XML data using an RDBMS,” IEEE Data Engineering Bulletin, Vol. 22, No. 3, 1999.

[7] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang, “Storing and querying ordered XML using a relational database system,” Proceedings of the ACM SIGMOD Conference, June 2002.

[8] ORACLE, “Berkeley DB introduction,” http://www.oracle.com/database/berkeley-db/.

[9] T. Fiebig, S. Helmer, C. C. Kanne, J. Mildenberger, G. Moerkotte, R. Schiele, and T. Westmann, “Anatomy of a Native XML Base Management System,” Technical Report 01, University of Mannheim, 2002.

[10] ORACLE, “Oracle XML data synthesis or XDS,” http://www.oracle.com/technology/tech/xml/xds/.

[11] SAXONICA, “SAXON XQuery Engine,” http://www.saxonica.com/.

[12] D. Florescu and D. Kossmann, “Storing and querying XML data using an RDBMS,” IEEE Data Engineering Bulletin, Vol. 22, No. 3, pp. 27-34, September 1999.

[13] I. Tatarinov, S. D. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang, “Storing and querying ordered XML using a RDB system,” Proceedings of the ACM SIGMOD Conference, June 2002.

[14] M. Yoshikawa, T. Amagasa, T. Shimura, and S. Uemura, “XRel: a path-based approach to storage and retrieval of XML documents using RDBs,” Proceedings ACM Transactions on Internet Technology, Vol. 5, August 2001.

[15] T. Grust, “Accelerating XPath location steps,” Proceedings of the ACM SIGMOD Conference, pp. 109-120, June 2006.

[16] M. Barg and R. K. Wong, “A fast and versatile path index for querying semi-structured data,” Proceedings of the DASFAA’03 Conference, pp. 249-256, March 2003.

[17] S. Hidaka, H. Kato, and M. Yoshikawa, “A relative cost model for XQuery,” Proceedings of the SAC’07 Conference, March 2007.

[18] K. Kang, J. G. Kim, H. K. Lee, H. S. Chang, S. J. Yang, Y. T. Kim, H. K. Lee, and J. W. Kim, “Metadata broadcasting for personalized service: A practical solution,” ETRI Journal, Vol. 26, No. 5, pp. 452-466, October 2004.