In the era of big data, large volumes of data affect our work, life and study, and even national economic development. Big data provides new ways of thinking about, analyzing and solving problems, and it has gradually become a hot research topic. After describing the concept and characteristics of big data, this paper reviews the development of big data analysis and storage technologies and analyzes the trends and the different kinds of value in commercial applications, manufacturing, biomedical science and other fields. Finally, the authors summarize the existing challenges of big data applications and argue that these challenges must be handled correctly.
Before big data appeared, the database had become an important processing platform because it made data processing convenient. However, databases have difficulty handling non-relational or large-scale data. Big data not only enhances the related computing service technologies but also changes the traditional mode of many industries. The latest report released by Markets and Markets shows that [
Big data is the hottest term in the IT industry, followed by data warehouse, data analysis and data mining. The commercial value of using big data has gradually become a source of profit on which professionals in different fields focus. Big data helps people acquire knowledge from massive, complex data, and it has become another focus after integrated circuits and Internet information technology. IBM, Amazon, Microsoft and other large companies are continually committed to developing and utilizing big data, triggering a boom in big data development.
The concept of big data originated in computer science and econometrics [
There are many definitions of big data. Wikipedia points out that big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data within a tolerable elapsed time [
The characteristics of big data were put forward by Viktor Mayer-Schönberger in “Big Data”. There are four characteristics: volume, velocity, variety and value [
Volume refers to the huge amount of data. With the development of data storage and network technology, data storage has expanded from the TB scale to the ZB scale. In 2011 alone, 1.8 ZB (1.8 trillion GB) of data were created. The number of warehouse management servers will increase 10-fold to 50-fold to cater to the growth of big data. Velocity refers to the speed at which data streams move. Because data move quickly, they are difficult to handle in the traditional way; cloud computing makes fast data processing achievable. Variety refers to the relational and non-relational data generated in many different ways. With the development of mobile networks, the use of real-time data has become more widespread, and the quantity of semi-structured and non-relational data is also increasing. Value reflects the significance of big data applications; this value is characterized by scarcity, uncertainty and diversity.
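The unit arithmetic behind the volume figures can be checked with a short Python sketch. It assumes decimal (SI) prefixes, which the text does not state explicitly, and the constant and variable names are illustrative.

```python
# Decimal (SI) byte units. This is an assumption: the text does not say
# whether decimal or binary prefixes are meant.
GB = 10**9    # gigabyte
TB = 10**12   # terabyte
ZB = 10**21   # zettabyte

# 1.8 ZB expressed in GB (integer arithmetic avoids float rounding).
data_2011_gb = 18 * ZB // (10 * GB)

# Number of TB in one ZB: the scale jump from TB to ZB.
tb_per_zb = ZB // TB

print(f"{data_2011_gb:,} GB")      # 1,800,000,000,000 GB
print(f"{tb_per_zb:,} TB per ZB")  # 1,000,000,000 TB per ZB
```

Under this assumption, 1.8 ZB is exactly 1.8 trillion GB, and moving from the TB scale to the ZB scale is a factor of one billion.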
All in all, although definitions of big data involve different concerns and technologies, there is a point of consensus: big data refers not only to large amounts of data but also to the techniques for processing them. The “4V” characteristics describe data at large scale. Volume, velocity and variety all serve to realize the value of big data, and data collection, storage and analysis are preparation for extracting that value. Big data emphasizes complexity in data analysis and pays more attention to data processing efficiency and data value.
From an economic development perspective, many large companies take big data seriously. IDC’s report claimed that global data will increase 50-fold over the next decade, as shown in
The development trends of big data published at the “2012 Hadoop and Big Data Technology Conference” showed that the top three topics were data resources, big data privacy issues, and the integration of big data and cloud computing. Chris Anderson, editor of Wired magazine, has asserted that data have made the traditional scientific method obsolete. Although this statement is somewhat extreme, big data has indeed changed our lives and our way of thinking. Big data is widely used. Many large companies, such as Microsoft, Apple, Oracle, Amazon, Google, Facebook and Twitter, now use big data to streamline processes and create efficiencies. They are experienced in dealing with big data sets [
Big data brings new methods to traditional data analysis and draws on a variety of technologies, including Hadoop and MapReduce, cloud computing, grid computing and so on. This paper reviews the following technologies.
Among the related technologies, the most representative is Hadoop, which stands for non-relational data analysis techniques. By virtue of its handling of unstructured data, massively parallel processing, ease of use and other advantages, Hadoop has become a mainstream technology. MapReduce is a model proposed by Google in 2004 for processing and generating big data in parallel [
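As a minimal sketch of the MapReduce model, the following single-process Python example counts words with explicit map, shuffle and reduce steps. It illustrates the programming model only and is not Hadoop's or Google's implementation; the function names are illustrative, and a real system would distribute the map and reduce calls across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)

def word_count(documents):
    # On a cluster, each map call could run on a different node;
    # here the calls run sequentially in one process.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big value", "data value"])
print(counts)  # {'big': 2, 'data': 2, 'value': 2}
```

Because each map call depends only on its own document and each reduce call only on its own key, both phases can be parallelized without coordination, which is what makes the model suitable for very large data sets.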
In addition to efficiency and speed, big data collection also requires security. A general data acquisition engine that combines a rule engine with a finite state automaton helps to verify the security and correctness of the big data acquisition flow [
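The idea of pairing a finite state automaton with a rule engine can be sketched in Python as follows. This is an illustration under stated assumptions, not the design of the cited engine: the states, actions, rules and record fields are all hypothetical.

```python
# Allowed transitions of a hypothetical acquisition flow (illustrative only).
TRANSITIONS = {
    ("idle", "connect"): "connected",
    ("connected", "authenticate"): "authenticated",
    ("authenticated", "collect"): "collecting",
    ("collecting", "collect"): "collecting",
    ("collecting", "close"): "closed",
}

# Hypothetical rules: each returns True when a collected record is acceptable.
RULES = [
    lambda record: bool(record.get("source")),   # record names its source
    lambda record: record.get("size", -1) >= 0,  # size is present and valid
]

def verify_flow(events):
    """Return True if the event sequence is a valid, complete flow.

    `events` is a list of (action, record) pairs; `record` is only
    checked for "collect" actions.
    """
    state = "idle"
    for action, record in events:
        next_state = TRANSITIONS.get((state, action))
        if next_state is None:
            return False  # transition not allowed by the automaton
        if action == "collect" and not all(rule(record) for rule in RULES):
            return False  # record rejected by the rules
        state = next_state
    return state == "closed"

good = [("connect", None), ("authenticate", None),
        ("collect", {"source": "sensor-1", "size": 10}), ("close", None)]
bad = [("connect", None), ("collect", {"source": "sensor-1", "size": 10})]
print(verify_flow(good), verify_flow(bad))  # True False
```

In this sketch the automaton checks the order of steps (for example, no collection before authentication), while the rules check the content of each collected record; the two checks are independent and can be extended separately.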
A big data processing system requires some of its components to run multiple instances of the same task in parallel in order to reach the desired level of application performance. To enable administrators and developers to keep pace with data growth, assessing the reliability of these systems is critical. A set of methods for approximate inference in probabilistic models, based on MFA, can address the performance evaluation problem of big data systems [
In addition to the above-mentioned techniques, M2M (Machine to Machine) technology is also important. An M2M platform can flexibly expand the number of data producers and data consumers, deliver new services in a very short time, and reuse and combine data from different sources. Existing studies have shown that the automatic creation of M2M decision support systems has much room for development [
The arrival of big data has changed many application areas, including business, traditional manufacturing, the biomedical field and others. Big data brings opportunities to enterprises: previously untapped data resources can be stored and processed, and new data collection techniques and advanced data mining tools provide unprecedented opportunities. This paper analyzes applications in business, manufacturing, the biomedical industry and other industries.
Business studies show that timely and effective use of data-driven knowledge is a competitive advantage. Combined with cloud computing, using LDA to extract themes from unstructured data makes those data useful and can help companies gain competitive advantage [
As a traditional industry, manufacturing is also under pressure from the advent of big data. Big data may drive the next revolution in manufacturing: predictive manufacturing. To become more competitive, manufacturers need to adopt emerging technologies, such as advanced analytics and physical networks [
Big data is changing the biomedical industry and bringing benefits to people. According to the McKinsey Global Institute, effective use of big data could help the US health care sector save $300 billion per year and reduce spending by 8% [
In addition to the above-mentioned applications, big data is also used in the study of history and geography. Historians have been making sense of reams of data for centuries. Now computational tools, along with a proliferation of digital source materials [
Overall, in business applications, big data is changing how companies gain competitive advantage, identify fraud risks and analyze trends and patterns. In traditional manufacturing, big data can help control product quality, reduce defects and production costs, and track every aspect of a product. In the biomedical industry, the development of big data is changing methods of treatment so that more patients can receive better care. In the study of history and geography, big data has also appeared in historical archive research and climate research. With further research and growing attention, big data applications will become increasingly widespread.
In the past decade, the development of the Internet has produced a great deal of data. History shows that the availability of new and timely data does not always bring the expected results. Big data also has shortcomings and challenges, such as data privacy, a lack of professionals and the leakage of personal information.
The biggest challenge of big data is data privacy. Privacy issues arise across the entire life cycle of big data: collection, combination, analysis and use. The development of big data poses a series of challenges that require a careful balance between threats and opportunities [
Based on the definition and characteristics of big data, this paper summarizes the development trends of big data. Because big data is a hot research topic, companies have taken an interest in it and are gradually obtaining results. The analysis of big data processing technologies is also deepening. People use Hadoop and MapReduce technology to enjoy the benefits of big data applications in commerce, manufacturing, the biomedical industry and other areas. At the same time, there are challenges such as data privacy, a lack of professionals and data leakage. Despite the great convenience that big data brings to people’s lives, it remains a double-edged sword. If the approach to data management does not change, big data will bring great challenges. Therefore, business managers need to understand these challenges thoroughly and respond to them correctly.
This study is funded by the National Natural Science Foundation of China (Grant No. 71171062).