Managing Social Security Data in the Web 2.0 Era
Li Luo1, Hongyan Yang2, Xuhui Li3
1College of Literature, Law & Economics, Wuhan University of Science and Technology, Wuhan, China; 2School of Political Sci-
ence & Public Administration, Wuhan University, Wuhan, China; 3State Key Lab of Software Engineering, Wuhan University, Wu-
han, China.
Received May 23rd, 2012; revised June 23rd, 2012; accepted July 23rd, 2012
Social security data management is an important topic both in the application of information management and in
social security management. In the Web 2.0 era, more and more information about people and healthcare is released to the
Internet through various channels. This abundance means that managing social security data now goes beyond managing
conventional social security database records. How to organize the conventional records together with the related
information gathered from the Web is an interesting problem whose solution would provide a more convenient and powerful
social security information service. In this paper, we introduce our initial work on building a Web-oriented social security
information system named i-SSIS. I-SSIS is a database system which adopts a new object-role data model named INM and
deploys the INM database system as its core. With the assistance of auxiliary tools for social security information
extraction, analysis and querying, i-SSIS can properly provide social security-related information gathered from the Web.
We introduce the basic ideas behind the design of i-SSIS and describe the architecture and major components of the system.
Keywords: Human Resource Management; Social Security; Data Management; Information System
1. Introduction
Managing social security data has a long history in data
management, especially in e-Government. In the early
days of data management [1], the US government designed
a well-defined database to manage social security
data and examined the major issues of building such a
system. Although social security data management
involved many practical problems at that time, it was still a
typical application of database systems or, more precisely,
of management information systems, in which a well-designed
database played the central role and the key technical
problems were organizing the data and designing the queries
for common applications.
With the arrival of the Web 2.0 era, many things in information
management have changed. There is an abundance
of data in all fields, and the approaches to providing,
sharing and utilizing the data have changed considerably. In the field
of social security data management, the key technical
point is shifting from organizing data and queries to
gathering and utilizing data. That is, although the central
component is still the social security records, a lot of
information on the Web related to people and other social
security issues is available and can be utilized together
with the social security records. More knowledge can be
discovered from their combination and further
used by governments and organizations in making decisions.
For example, as an important portion of social security
data, health care information was usually managed
in the same way as other kinds of data. However, in
practical applications, people commonly want to
know the associations among a specific group of people,
the diseases, the therapies and the expenses. The
information here involves not only the pure social security
records, but also data about people’s career background,
clinical information, and so on. This extra
information was hard to get in the pre-Internet era.
However, it is not difficult to gather and analyze in the Web
2.0 era, since many people share their personal
information in blogs and various kinds of medical
information can be explored and gathered from the Web, e.g.,
Wiki. Managing social security data now goes
beyond managing social security records. It is becoming
a task of integrating and utilizing the various kinds of data
involved in human and social security and of providing
various information services for the people who are concerned with
issues related to social security. In this task the Web
plays an important role.
To accomplish the task mentioned above, data man-
agement approaches and tools need to be improved and
enhanced. However, up to now, there is still a lack of
study considering the challenges brought about by the
new task, and researchers in the fields of data
management and social security have seldom proposed a feasible and
practical scheme for providing a social security
information service that embodies the features
of the Web 2.0 era.
In this paper, we propose a framework for a new social
security information system for the Web 2.0 era. In this
system, named i-SSIS (Internet-oriented Social
Security Information System), data management of
social security is no longer based on a conventional
database. We deploy a new database system that can manage
the human resource information and other information
related to social security. Based on this novel system,
many new value-added information services for social
security can be developed, and among them the social
security search service is the one we are working to build.
The rest of the paper is organized as follows. In Section 2
the related work on social security data management
and utilization is briefly discussed. Section 3
illustrates a scenario indicating how the information
sources in the Web affect social security. In Section 4,
the architecture of i-SSIS is illustrated and the major
components are introduced. In Section 5, the ongoing
work on searching service of social security information
is introduced. Section 6 concludes the paper.
2. Related Works
Research on social security data management and utilization
can be traced back to the late 1970s [1,2] and has kept
pace with the development of studies in both social
security and information technology. Nowadays,
government departments and companies involved in social
security affairs all have their own social security databases and
information systems. Various kinds of studies
in the information technology field have worked
on social security data.
Some studies on social security data mining try to ana-
lyze the social security data to find patterns of social se-
curity related affairs, such as debt [3,4] and health care
[5]. Some studies concentrate on protecting social secu-
rity data [6] since they are often confidential and should
be accessed only for specific use.
Although social security data are usually stored and managed
by conventional database systems, there is a new tendency
to develop information management for social security
data that caters for current data utilization technologies
such as data mining. For example, studies are working on
developing data warehouses for healthcare data for
efficient mining [7].
Another relevant field with rapid development is human
resource management [8], where people are building
more comprehensive and efficient information systems to
collect and manage various kinds of human resource information.
Inspired by these trends in social security data management
and utilization, we propose a new information
system that combines human resource information with
conventional social security data, so that various kinds of
people, e.g., managers, researchers and officials,
can access and utilize the data with specified
authorizations and privileges. This system, named i-SSIS, is
oriented to Web 2.0 resources, from which we can gather
the information of people directly and link the relevant
data to form a rich information network. By integrating
social security data with the human resource information
gathered from the Web, various kinds of data analysis can
be carried out to provide rich social security information
3. A Scenario
In conventional social security data management, the da-
tabase stores the basic information of each person, such
as the career and education background, healthcare re-
cords, salary and pension records, etc. These records can
provide important information for administrators or re-
searchers to make some basic statistics and decisions.
However, if we want to know more about the relationships
between certain information involved in social
security, conventional records are often not enough.
For example, a researcher in the social security field wants
to investigate the situation of commercial medical insurance
among certain groups of people in China. He first
chooses the teachers and clerks in universities and colleges
as the subjects of study. These people are usually covered
by public medical care; however, public medical
care is often not enough for them because of limited funding,
and thus many teachers choose various kinds of
commercial medical insurance as complementary insurance.
This situation makes them good subjects
for the investigation.
In conventional social security records, the informa-
tion about the people and the health care is quite plain.
Usually only the disease name, the fee and the security id
are recorded. However, for a deeper investigation, the
researcher especially wants to know how the career environment
or, more specifically, the research background of
the teachers in universities affects their health and
how they choose commercial medical insurance
accordingly. The information required for
such an investigation is rarely stored in conventional
social security databases. Therefore, he needs to
explore various kinds of information about the teachers
manually. Fortunately, most of the information involved can
be found on the Web. For example, he can find the
education and research background of the teachers in
their homepages, the research and the health information
from their blogs. He can also find the diseases and thera-
pies from the Web, and all kinds of medical insurance
information from the Websites of insurance companies.
Now the problem is that it is a big burden for him to
gather, sort and analyze the information from the Web
since he is not an expert in computer engineering. Therefore,
he needs help from a technician in the computer
field, or he can resort to i-SSIS to find the information.
I-SSIS is an information system which gathers information
related to social security from the Web and provides
typical information services to end users in the social
security field. Like a search engine, i-SSIS uses the
data crawled from the Web and analyzes it to extract
social security information, human resource information,
etc. It deploys a certain data model to organize the extracted
information and uses a database system to store
and query the information. Therefore, the social security
data in the Web can be conveniently managed and utilized
under i-SSIS.
4. Architecture of I-SSIS
I-SSIS is a database system which gathers, organizes and
manages the social security data from the Web. The database
system is built upon a database management system
INM-DBMS as its backbone and uses certain auxiliary
tools to provide various functions. The reason we
chose INM-DBMS to manage social security data is that
INM-DBMS adopts a novel data model named Information
Network Model (INM) [9,11] which can easily associate
the data about an entity and thus is appropriate and
convenient to describe and manage the data of people
gathered from the Web. In this section, we first introduce
how the basic features of INM are used to model social
security data and then describe the architecture of i-SSIS.
4.1. Modelling Social Security Data with INM
INM is an object-role model which can expressively
specify the attributes, roles and relationships of entities in
the real world. It supports role-relationship classes and
class inheritance, which can effectively present the complex
networked semantics of entities. It can present the
information of real-world entities by associating each of
them with a single object with a unique oid.
In INM, an object is represented as a tree. The root of
the tree is a list of object names associated with the object
and each sub-tree corresponds to a property. An INM
instance database is a set of classified objects.
Figure 1 illustrates a sample of INM instances which
contains information about the university WHU, the course
ADB, and several people Bob, Amy, Ada, Ann. The object
WHU has two relationship hierarchies with roots Faculty
and Student. The relationship Faculty is specialized into
Prof with value Bob and Lecturer with value Amy. The
relationship Student is specialized into UnderGrad with
value Ada and GradStudent which is further specialized
into M.Sc with value Ann and PhD with value Amy.
Figure 1. A sample of INM instances.
I-SSIS is built upon an INM instance database which
is designed specifically for social security data management.
In i-SSIS the entities can be classified into 4 categories
of objects: Person, Role, Organization and Insurance.
A Person object represents a person entity in the real world,
including the attributes which are of concern in social
security such as birthday, identity (or social security)
number, living place, etc. A Role object represents a role
which a person plays in a certain circumstance, and an
Organization object represents an organization, e.g., a
company, an institution or an association, in which a Person
plays a certain role. For example, a person Bob is a faculty
member in a university; meanwhile he is also an
athlete in his spare time and belongs to a sport association.
As INM indicates, a role can have related attributes
that pertain to the person playing it. Therefore,
Person, Role and Organization objects can work together
to establish a full background of the people
concerned. An Insurance object represents a kind of social
security insurance such as pension or healthcare.
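The four object categories described above can be illustrated with a minimal Python sketch. It is not the INM-DBMS data definition language, which this paper does not show. The class layout, the oid counter, the attribute names and the sample values (birthday, security number, the sport association's name) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical oid generator; the real INM oid scheme is not described here.
_next_oid = count(1)


def _new_oid():
    return next(_next_oid)


@dataclass
class Organization:
    name: str
    oid: int = field(default_factory=_new_oid)


@dataclass
class Insurance:
    kind: str  # e.g. "pension" or "healthcare"
    oid: int = field(default_factory=_new_oid)


@dataclass
class Role:
    title: str
    organization: Organization
    attributes: dict = field(default_factory=dict)  # role-specific attributes
    oid: int = field(default_factory=_new_oid)


@dataclass
class Person:
    name: str
    birthday: str
    security_number: str
    living_place: str
    roles: list = field(default_factory=list)
    insurances: list = field(default_factory=list)
    oid: int = field(default_factory=_new_oid)

    def organizations(self):
        """Names of the organizations in which this person plays a role."""
        return [role.organization.name for role in self.roles]


# The Bob example from the text, with placeholder attribute values.
whu = Organization("WHU")
club = Organization("Sport Association")
bob = Person("Bob", "1970-01-01", "000-00-0000", "Wuhan")
bob.roles.append(Role("Faculty", whu, {"rank": "Prof"}))
bob.roles.append(Role("Athlete", club))
bob.insurances.append(Insurance("healthcare"))
```

Keeping the role as its own object, rather than as a plain attribute of the person, mirrors the point made in the text: role-specific attributes stay attached to the role, and one person can be linked to several organizations at once.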
4.2. Architecture of I-SSIS
As a system to gather and manage social security data,
i-SSIS has a 3-layer architecture to undertake the func-
tionalities of gathering, managing and serving in each
layer respectively, as Figure 2 shows.
Figure 2. Architecture of i-SSIS.
Under i-SSIS is the raw data in the Web, which is
crawled and processed by the modules in the Data Collection
Layer. The crawlers in i-SSIS fetch pages
from Web sites to gather data involving human resource
and social security information. The crawlers are
embedded with inner analyzers that filter out unnecessary
information during crawling, which means that
only the documents involving social security information
such as the person, the organization, the role and the
insurance are collected. Then the raw data is processed
through three procedures to become the entities
managed by the database system. Firstly, the information
of i-SSIS entities, e.g., people and organizations, is
extracted with information preprocessing tools such as
natural language processing tools. In this procedure, Web
documents are initially summarized with a statistics tool
to find their themes and are then processed by an information
extraction tool to get the rough information about i-SSIS
entities. For example, when the blog of a person Bob is
processed, the biography, the affiliation, the occupation
and the important social relations are extracted to construct
his basic information. Secondly, the various pieces of
information about each entity are identified and condensed to generate
entity objects in i-SSIS, as the data collection result for
entities. As previously mentioned, in INM an entity is
presented as a single object. However, it is common in
information extraction that a single entity, e.g., a person,
is described in different document fragments from different
aspects. Therefore, in this layer, data mining tools
and other analysis tools are used to combine the information
about each distinct entity and formulate it into a predefined
schema of i-SSIS entities. Thirdly, the entity and
relationship data collected from the Web are integrated
with the conventional social security data. The latter is
collected from social security databases through common
interfaces or deep Web extraction tools. By combining
the data on entities, the integrated data warehouse can
provide a consistent and comprehensive description of
the entities involved in social security. After the three
procedures, the Web-oriented social security data is finally
gathered and provided to the Data Management Layer.
Processing the raw data to get social security related
information is fundamental in building a practical i-SSIS.
The research on and implementation of the tools in this
stage are under way, and we have already built a
prototype which can semi-automatically gather and process
the information of the people in education organizations
such as universities, chosen because of the abundance of
Web documents about the people and organizations in
this field. The crawlers gather documents from the
universities, the research institutes, the homepages and
the sample social security databases. From the raw data,
the i-SSIS entities, i.e., the people, the organizations, the
social security records, etc., are recognized and fused to
reflect the entities in the real world. However, since the
information extraction and analysis tools are quite difficult
to customize for the practical work, a lot of
work has to be undertaken manually. We are improving
the tools to make data collection more efficient.
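The second procedure, in which fragments describing the same entity are condensed into a single object, can be sketched as follows. This is only an illustration under stated assumptions and does not show the data mining tools used in the prototype. The matching rule (same name and affiliation), the field names and the sample fragments are hypothetical.

```python
from collections import defaultdict


def fuse_fragments(fragments):
    """Merge extracted fragments that describe the same entity into one record."""
    merged = defaultdict(dict)
    for frag in fragments:
        # Assumption: name plus affiliation is enough to identify a person.
        key = (frag["name"].strip().lower(), frag["affiliation"].strip().lower())
        record = merged[key]
        for attr, value in frag.items():
            if attr not in record:
                record[attr] = value  # first value seen wins
            elif record[attr] != value:
                # Conflicting values are kept aside for manual review.
                record.setdefault("_conflicts", []).append((attr, value))
    return list(merged.values())


# Hypothetical fragments, e.g. from a homepage and a blog.
fragments = [
    {"name": "Bob", "affiliation": "WHU", "occupation": "Professor"},
    {"name": "Bob", "affiliation": "WHU", "research": "databases"},
    {"name": "Amy", "affiliation": "WHU", "occupation": "Lecturer"},
]
entities = fuse_fragments(fragments)
```

Conflicting values are collected rather than overwritten, which fits the observation above that part of the work still has to be done manually.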
In the Data Management Layer, i-SSIS directly deploys
INM-DBMS to store, manage and query the social
security data provided by the layer below. The information
in i-SSIS usually falls into the four categories of INM
objects described in the last section, and their storage,
querying and indexing are managed by INM-DBMS. To make
the database efficient for social security data management,
the INM-DBMS module in i-SSIS is specially customized
to speed up the common data manipulations. On the one
hand, the data storage is optimized to cluster the data
about entities where possible, because there are often query
requests in practice that find associated attributes of people or
organizations in order to fetch related social insurance
data. On the other hand, temporal data manipulation, a
common feature of human resource information management,
is also supported here, and the storage and the
index are optimized so that temporal queries are
processed more conveniently and quickly.
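The kind of temporal query mentioned above can be illustrated with a small sketch that stores each role with a validity interval and answers "as of" questions. It does not reflect the storage or index design of INM-DBMS. The `RolePeriod` class, the `roles_as_of` function and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RolePeriod:
    person: str
    role: str
    organization: str
    start: date
    end: Optional[date] = None  # None means the role is still held

    def holds_on(self, day: date) -> bool:
        """True if the role is valid on `day` (start inclusive, end exclusive)."""
        return self.start <= day and (self.end is None or day < self.end)


def roles_as_of(history, person, day):
    """Roles that `person` held on the given day."""
    return [p.role for p in history if p.person == person and p.holds_on(day)]


# Made-up dates for illustration only.
history = [
    RolePeriod("Amy", "PhD", "WHU", date(2005, 9, 1), date(2009, 7, 1)),
    RolePeriod("Amy", "Lecturer", "WHU", date(2009, 7, 1)),
]
```

For example, `roles_as_of(history, "Amy", date(2008, 1, 1))` returns `["PhD"]`. A linear scan is used here for brevity; a real system would index the intervals.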
In the Social Security Information Service Layer, a
query interface named SSQ, standing for Social Security
Query, is provided as the major component of the
human-computer interface of i-SSIS. SSQ is much more
than a simple application of INM-DBMS’s query [10]
interface because it uses a temporal query language
named HRQL as the intermediate query language for the
Data Management Layer. HRQL can easily express query
requirements on temporal information of person entities.
This query language is an extension of our previous
study [12], and the entity-object correspondence feature
of INM-DBMS is properly utilized in processing
HRQL. The SSQ interface transforms the user’s query
requests into typical queries implemented in HRQL,
and then forwards the parsed queries to the lower layers.
Besides the query service, a novel social security
information searching service is proposed in establishing
i-SSIS and is discussed briefly in the next section.
5. Searching Service of I-SSIS
Social security information service is a special part of
social security data management because it provides users
with the means to utilize the social security data in practice.
Conventionally, the information service is only the
common query service provided by the database system.
However, in the Internet era, everything is associated
with searching. Therefore, it is important and challenging
to build a searching service specifically for social security
information to make the work more efficient and convenient.
In i-SSIS, we propose and design a social security
information searching service with the assistance of the
Pluto searching engine [13], which is supported by
INM-DBMS to undertake INM data searching. The searching
service deploys a structured data searching model.
The model specifies the Steiner trees containing all the
keywords as the searching results. It ranks the results
according to the compactness of the tree, the authority of
the nodes and the redundancy of information. This searching
model enables users who are not familiar with the
INM model to utilize the social security information in
i-SSIS without being aware of the networked structure of
the underlying entities and relationships. It can not only
increase the quality and precision of searching results but
also present users with useful semantic relationships and
context information of entities.
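The idea of returning a tree that connects all keywords can be sketched with a common heuristic: run a breadth-first search from the nodes matching each keyword, then join the shortest paths at a shared root. This is not the Pluto algorithm and it does not compute exact Steiner trees. Only compactness (edge count) is used for ranking, and the authority and redundancy criteria are omitted. The graph, the labels and the function names are hypothetical. In general the union of shortest paths may contain redundant edges that would need trimming.

```python
from collections import deque


def bfs_from(graph, sources):
    """Multi-source BFS: distance and parent pointer for every reachable node."""
    dist = {s: 0 for s in sources}
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parent[nxt] = node
                queue.append(nxt)
    return dist, parent


def keyword_trees(graph, labels, keywords):
    """Candidate answers as (edge_count, root, edges), most compact first."""
    searches = []
    for kw in keywords:
        sources = [n for n, text in labels.items() if kw.lower() in text.lower()]
        if not sources:
            return []  # a keyword without matches: nothing covers all keywords
        searches.append(bfs_from(graph, sources))
    found = {}
    for root in graph:
        if not all(root in dist for dist, _ in searches):
            continue
        edges = set()
        for _, parent in searches:
            node = root
            while parent[node] is not None:  # walk back to a matching node
                edges.add(frozenset((node, parent[node])))
                node = parent[node]
        found.setdefault(frozenset(edges), root)  # drop duplicate edge sets
    return sorted(((len(e), r, e) for e, r in found.items()),
                  key=lambda t: (t[0], t[1]))


# Hypothetical undirected entity graph given as adjacency lists.
graph = {
    "bob": ["whu", "health"],
    "amy": ["whu"],
    "whu": ["bob", "amy"],
    "health": ["bob"],
}
labels = {"bob": "Bob", "amy": "Amy", "whu": "WHU university",
          "health": "healthcare insurance"}
trees = keyword_trees(graph, labels, ["amy", "healthcare"])
```

The user supplies only keywords. The connecting nodes (here WHU and Bob) are discovered by the search, which is how such a model hides the networked structure from users.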
The searching service utilizes a heuristic searching approach
based on pruning matching nodes. This approach
can enhance the searching efficiency by pruning the
matching nodes that have little chance of contributing to the
top-k results. Based on this approach, the underlying INM-DBMS in i-SSIS
deploys a special index system to support pruning matching
nodes. The index system collects the neighborhood
information of the nodes and utilizes it to calculate
latent matching nodes. The prototype under development shows
that the searching service can provide users with a novel and
efficient experience of utilizing social security data,
distinct from the conventional one.
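One way to read the pruning idea is sketched below: for every node, the index stores the words that occur within a fixed number of hops, and a matching node is discarded when its neighborhood cannot cover all query keywords. This is an assumption about how such an index could work, not a description of the INM-DBMS index. The radius, the word-level matching, the graph and all function names are hypothetical.

```python
from collections import deque


def build_neighborhood_index(graph, labels, radius):
    """For each node, collect the lowercase label words within `radius` hops."""
    index = {}
    for start in graph:
        hops = {start: 0}
        queue = deque([start])
        words = set()
        while queue:
            node = queue.popleft()
            words.update(labels[node].lower().split())
            if hops[node] < radius:
                for nxt in graph[node]:
                    if nxt not in hops:
                        hops[nxt] = hops[node] + 1
                        queue.append(nxt)
        index[start] = words
    return index


def prune_matches(index, labels, keywords):
    """Keep matching nodes whose neighborhood contains every keyword."""
    wanted = {kw.lower() for kw in keywords}
    kept = []
    for node, words in index.items():
        own_words = set(labels[node].lower().split())
        if wanted & own_words and wanted <= words:
            kept.append(node)
    return kept


# Hypothetical graph with two people named Bob; only one is near "healthcare".
graph = {
    "bob": ["whu", "health"],
    "amy": ["whu"],
    "whu": ["bob", "amy"],
    "health": ["bob"],
    "bob2": ["club"],
    "club": ["bob2"],
}
labels = {"bob": "Bob", "amy": "Amy", "whu": "WHU", "health": "healthcare",
          "bob2": "Bob", "club": "Sport"}
index = build_neighborhood_index(graph, labels, radius=1)
survivors = prune_matches(index, labels, ["Bob", "healthcare"])
```

The second Bob is dropped before any tree search starts, because no node within one hop of him mentions healthcare.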
6. Conclusions
Social security data management is an important topic
both in application of information management and so-
cial security management. In the Web 2.0 era, more and
more human information and healthcare information is
released to the Internet through various approaches. This
abundance makes managing social security data go be-
yond managing conventional social security database re-
cords. How to organize the conventional records together
with the related information gathered from the Web is an
interesting problem to solve to provide more convenient
and powerful social security information service.
In this paper, we introduce our initial work on building
a Web-oriented social security information system named
i-SSIS. I-SSIS is built on a special database management system,
INM-DBMS, which is used to describe and manage
real-world entities using a data model named INM.
With the support of INM-DBMS and other preparation and
query tools, i-SSIS can efficiently manage the social
security information and provide useful services for
common or specific purposes. Up to now, i-SSIS is still in its
initial design phase, and we are working on building
efficient information extraction tools to gather and analyze
the social security related information from Web
resources. In the next step we will integrate the social
security extraction system with i-SSIS and provide a
prototype for searching and utilizing the social security
7. Acknowledgements
This paper is partially supported by the Key Research
Funds of Hubei Small and Medium-sized Enterprise Re-
search Center under contract No.WH2011001 and the
National Social Science Foundation of China under con-
tract No.09CZZ032.
