**Journal of Financial Risk Management**

Vol.06 No.02(2017), Article ID:77123,28 pages

10.4236/jfrm.2017.62013

Research on P2P Network Loan Risk Evaluation Based on Generalized DEA Model and R-Type Clustering Analysis under the Background of Big Data

Ximing Lv^{1,2}, Lan Zhou^{3}, Rui Zhang^{3}, Xiaona Guo^{3}

^{1}School of Mathematical Sciences, Inner Mongolia University, Hohhot, China

^{2}School of Statistics and Mathematics, Inner Mongolia University of Finance and Economics, Hohhot, China

^{3}School of Finance, Inner Mongolia University of Finance and Economics, Hohhot, China

Copyright © 2017 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: April 29, 2017; Accepted: June 20, 2017; Published: June 23, 2017

ABSTRACT

Internet financial risk bears directly on the operation and development of the Internet financial system itself and, because the sector is growing so quickly in speed and scale, it also has an important impact on the country’s macroeconomic operation. As of February 2017, there were 2335 network loan platforms, of which 55 were problem platforms. Incidents such as platform operators absconding with funds have occurred frequently because of lax supervision, credit risk and other factors. It is therefore very important to evaluate Internet financial risk scientifically. This paper takes as its main research object the risk control of the top 100 P2P network loan platforms that have obtained the net loan home’s rating authentication. The evaluation index system is structured along three dimensions: liquidity risk, market risk and credit risk. R-type cluster analysis is used to reduce the dimension of the index system and obtain the core evaluation index system. On this basis, the risk control capability efficiency of the platforms is first evaluated with the classical DEA-CCR model, and the platforms are then rated as excellent, good, medium or poor according to a pre-set step size. Excellent platforms are those ranked in the first quarter by the comprehensive efficiency derived from DEA-CCR; non-excellent platforms are all the other platforms among the 100 studied.
Taking the excellent P2P network loan platforms as the reference set and the non-excellent platforms as the evaluation set, this paper then uses the generalized DEA model to study “catch-up efficiency” and carry out projection analysis. The projection value of each non-excellent platform, that is, its improvement value on each research index, is obtained, which provides a feasible path for non-excellent P2P network loan platforms to become excellent ones.

**Keywords:**

Generalized DEA, Cluster Analysis, Internet Finance, Risk Evaluation, P2P Network Loan, Big Data

1. Introduction

Research on Internet financial risk assessment models must resolve two critical issues: determining the index system and determining the modeling method. Whatever method is used, indicator data must be selected. The index screening method based on cluster analysis proposed in this paper uses hierarchical clustering to eliminate the correlation between indicators, which helps preserve the importance of each retained index to the decision variables. Many scholars at home and abroad have studied Internet financial risk and the methods used here. Afsharian, Ahn and Neumann (2016) discussed the selection of input/output factors from a goal-oriented perspective, revealing the role these factors play in DEA and addressing the difficulties of factor determination, dual-role factors and undesirable factors. Charnes, Cooper, & Rhodes (1978) established a nonconvex programming model with a new definition of efficiency that provides a scalar measure of the efficiency of each participating unit, together with a method for objectively determining weights from observations of multiple outputs and multiple inputs. The duality of the associated linear programming models provides a new way to estimate extremal relations from observed data and links the engineering and economic approaches to efficiency. Jain and Dube (1988) proposed a clustering analysis algorithm that, without a prior classification of the data, divides the data according to the similarity among objects of the same class and the difference between objects of different classes. Liu and Lv (2016) used the new DEA model to carry out “projection analysis” of the weak decision units whose “catch-up efficiency” is less than 1, and provided the optimal improvement strategy.
Taking typical P2P network loan platforms as the research object, Lin (2015) constructed the Z’-P model, which can be applied to the risk measurement of P2P loan platforms, through qualitative and quantitative analysis of the credit risk characteristics of network loan platforms. Lv (2016) examined the impact of Internet finance on joint-stock commercial banks by constructing a generalized DEA model, and used a Tobit model to derive countermeasures and suggestions for improving innovation ability. Ma (2012a) proposed that the reference set of the generalized DEA method need not be limited to the effective decision-making units; it can also include other comparison units such as average units (for example, the enrollment mark), low units (for example, the tolerable limit) or special units (for example, selected samples, standards or particular objects). Ma (2012b) constructed the theory and method of “generalized DEA” and applied it to multi-attribute decision-making unit evaluation, fuzzy comprehensive evaluation, preference ranking, risk assessment, combination efficiency evaluation, panel data analysis and system analysis, as well as to fields such as biophysics. Ma and Lv (2007) gave the sample data envelopment analysis model with a preference cone and analyzed the distribution characteristics and projection properties of the decision units relative to the sample. Ma and Zhao (2016) established a generalized DEA model and gave a necessary and sufficient condition for the existence of feasible solutions to it, as well as the condition under which unit efficiency is overrated, effectively addressing the measurement of effectiveness.
Ouyang and Mo (2016) used daily yield data for the Internet index and the Shanghai Composite Index to establish a Pareto extreme value distribution model and a historical simulation model under the VaR method, measured the value at risk of Internet finance, and concluded that Internet finance risk is greater than the risk of the stock market as a whole. Sha (2015) analysed the collapse of P2P network loan platforms and put forward proposals for strengthening internal risk management and external supervision of the platforms, standardizing the development of the industry and promoting financial innovation. Si and Sun (2011) used MATLAB to standardize data and identify strongly correlated indexes in order to reduce dimension. Wang and Shi (2016) used a CRITIC-gray relational model to construct an Internet financial risk evaluation system and the VaR method to measure the size of Internet risk. Zhang, Ramakrishnan and Livny (1996) proposed the BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) algorithm for clustering large-scale data sets. It is an effective hierarchical clustering algorithm that can cluster the data in a single scan and handle outliers effectively.

This paper discusses Internet financial risk assessment in terms of constructing a reasonable evaluation index system, efficiency evaluation and classification, “catch-up efficiency” research and projection analysis. It is based on the loan data of 95 representative P2P network loan platforms among the top 100 that have obtained the net loan home’s rating authentication (five platforms are excluded because part of their data is missing), and addresses the following aspects of Internet financial risk assessment:

First, the indicators with the greatest influence on the risk assessment of network loan platforms are selected. This paper considers February 2017 data on nine indicators for 95 network loan platforms from five sectors: private, banking, listed company, venture capital and state-owned. R-type clustering is carried out on the correlation coefficients of the nine indicators, the dimension of the evaluation index data is reduced, and six indicators with a core influence on Internet financial risk are selected.

Second, the initial rating and classification of risk control capability efficiency are carried out. With the selected core indicators as the input-output system, the DEA-CCR model is used to evaluate the comprehensive efficiency, pure technical efficiency and scale efficiency of each network loan platform. The platforms are then ranked by comprehensive efficiency and divided into four grades (excellent, good, medium and poor) according to the defined step size.

Finally, the “catch-up efficiency” study and projection analysis are carried out. The excellent network loan platforms are taken as the reference set and the non-excellent P2P network loan platforms as the evaluation set. The generalized DEA method yields the relative (catch-up) efficiency of the non-excellent platforms, and the DEA projection formula then gives a feasible path for transforming non-excellent P2P network loan platforms into excellent ones.

2. Construction of Evaluation Index System

This paper takes network loan platforms as the research object and evaluates their risk control. China’s Internet finance is developing rapidly and on a massive scale and, most importantly, its influence on the economy is hard to predict. The central bank has continued to issue regulations for network loan platforms, which indicates that they have largely been brought into the regulatory system. Network loan platforms not only fuel the development of Internet finance but also have a great impact on it.

The Selection of Indicators

This paper selects indicators along the dimensions of liquidity risk, market risk and credit risk. At present, China’s network loan platforms are mainly engaged in the lending business, so, with reference to the relevant regulatory system, representative indexes are selected to evaluate financial risk from different dimensions. The platforms have been developing rapidly, but considerable disorder lies hidden behind this high-speed development. Based on the characteristics of Internet financial business, the risk evaluation index system can be divided into five risk dimensions: operational risk, national risk, liquidity risk, market risk and credit risk. As the Internet financial industry has developed, China has managed existing Internet financial problems effectively by building big data and cloud computing capabilities, which has reduced national risk, and Internet finance has begun to form unified, standardized operating procedures. Most users now have a preliminary understanding of Internet finance, which reduces the operational risk between consumers and service providers. This paper therefore studies Internet financial risk under the regulatory principles of liquidity risk, market risk and credit risk, and also takes into account less predictable factors such as the transaction index and popularity index, which makes the evaluation system more flexible. The resulting Internet financial risk evaluation index system is shown in Table 1. It combines the Basel requirements on information disclosure by commercial banks, greater information transparency and reduced information asymmetry, and it considers the impact of the average expected return rate, average borrowing period and transparency index on Internet financial risk; different types of data were selected accordingly.

3. Research Methodology

3.1. R-Type Cluster Analysis Model

3.1.1. Data Normalization

The data are standardized with the help of MATLAB. Let ${a}_{ij}$ denote the value of the j-th indicator for the i-th sample, with m samples and n indicators:

${b}_{ij}=\frac{{a}_{ij}-{\overline{a}}_{j}}{{s}_{j}},\quad i=1,2,\cdots ,m;\ j=1,2,\cdots ,n$ (1)

${\overline{a}}_{j}=\frac{1}{m}\sum _{i=1}^{m}{a}_{ij},\quad {s}_{j}=\sqrt{\frac{1}{m-1}\sum _{i=1}^{m}{\left({a}_{ij}-{\overline{a}}_{j}\right)}^{2}}$ (2)
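As an illustration of the standardization in (1) and (2), here is a minimal sketch in Python (standard library only) rather than the MATLAB used in this paper. The function name `standardize` and the sample values are hypothetical and are not taken from the paper's data.

```python
from statistics import mean, stdev

def standardize(columns):
    """Z-score each indicator column: b_ij = (a_ij - mean_j) / s_j."""
    result = []
    for col in columns:
        m = mean(col)
        s = stdev(col)  # sample standard deviation, i.e. the m-1 denominator
        result.append([(a - m) / s for a in col])
    return result

# Hypothetical values of two indicators on four platforms (illustration only)
raw = [[10.0, 12.0, 14.0, 16.0], [0.5, 0.7, 0.6, 0.8]]
z = standardize(raw)
```

After this step each indicator has mean 0 and sample standard deviation 1, so indicators measured in different units become comparable.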

3.1.2. Determine the Variable Similarity Measure

MATLAB is used to compute the correlation coefficients between the indicators. Let the observed values of variable ${x}_{j}$ be ${\left({x}_{1j},{x}_{2j},\cdots ,{x}_{mj}\right)}^{\text{T}}\in {R}^{m}$ $\left(j=1,2,\cdots ,n\right)$. The sample correlation coefficient of two variables ${x}_{j}$ and ${x}_{k}$ can then be used as their similarity measure. That is:

${r}_{jk}=\frac{\sum _{i=1}^{m}\left({x}_{ij}-{\overline{x}}_{j}\right)\left({x}_{ik}-{\overline{x}}_{k}\right)}{{\left[\sum _{i=1}^{m}{\left({x}_{ij}-{\overline{x}}_{j}\right)}^{2}\sum _{i=1}^{m}{\left({x}_{ik}-{\overline{x}}_{k}\right)}^{2}\right]}^{\frac{1}{2}}}$ (3)
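The correlation coefficient in (3) can be sketched as follows, again in Python (standard library only) rather than MATLAB. The function names `pearson` and `correlation_matrix` and the sample columns are assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Sample correlation coefficient of two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def correlation_matrix(columns):
    """Matrix of pairwise correlation coefficients between indicator columns."""
    return [[pearson(cj, ck) for ck in columns] for cj in columns]

# Hypothetical indicator columns (illustration only)
cols = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [4.0, 3.0, 2.0, 1.0]]
r = correlation_matrix(cols)
```

Indicators whose coefficient is close to 1 in absolute value carry largely the same information, which is what R-type clustering exploits when it groups variables.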

Table 1. Internet financial risk assessment index system.

Note: The original data are the February 2017 monthly data from the network loan home (http://shuju.wdzj.com/), compiled by the author.

3.1.3. Calculate the Similarity Measure

The similarity between classes of indicators is measured by the group average method:

$D\left({G}_{1},{G}_{2}\right)=\frac{1}{{n}_{1}{n}_{2}}\sum _{{x}_{i}\in {G}_{1}}\sum _{{x}_{j}\in {G}_{2}}d\left({x}_{i},{x}_{j}\right)$ (4)

That is, the distance between two classes equals the average distance between all pairs of sample points drawn one from each class, where ${n}_{1}$ and ${n}_{2}$ are the numbers of sample points in ${G}_{1}$ and ${G}_{2}$ respectively.
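The group average distance in (4) can be illustrated with a short Python sketch (standard library only, not the paper's MATLAB). The function name `average_linkage`, the one-dimensional points and the absolute-difference distance are hypothetical; in the clustering of indicators the pairwise distance would instead be derived from the correlation coefficient.

```python
def average_linkage(group1, group2, dist):
    """Average of dist(a, b) over all pairs with a in group1 and b in group2."""
    total = sum(dist(a, b) for a in group1 for b in group2)
    return total / (len(group1) * len(group2))

# Hypothetical one-dimensional points with absolute difference as the distance
g1 = [0.0, 2.0]
g2 = [5.0, 7.0]
d12 = average_linkage(g1, g2, lambda a, b: abs(a - b))
# pairwise distances are 5, 7, 3, 5, so the average is 5.0
```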

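3.1.4. Using MATLAB to Draw the Clustering Tree

The following code is run in MATLAB:

clear

load yuanshishuju.txt % raw data; assumed layout: rows are platforms, columns are indicators

d = pdist(yuanshishuju', 'correlation'); % transpose so that the indicators (variables) are clustered

z = linkage(d, 'average'); % group average linkage

h = dendrogram(z); % draw the clustering tree

set(h, 'Color', 'k', 'LineWidth', 1.3)

T = cluster(z, 'maxclust', 4) % cut the tree into 4 classes

for i = 1:4

tm = find(T == i);

tm = reshape(tm, 1, length(tm));

fprintf('Class %d: %s\n', i, int2str(tm)); % list the indicators in class i

end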

This yields the R-type clustering tree. The dimension of the selected indicator data is reduced, the core indicators that affect the lending platforms are selected, and the input-output index system is established for detailed analysis.

3.2. Classical DEA-CCR Model

Using the classical DEA-CCR model, an input-output system is established on the six indexes: registered capital and average expected yield are taken as the input indexes, and the leverage index, dispersion index, liquidity index and transparency index as the output indexes. The model gives the comprehensive efficiency value of each loan platform, from which the platforms are classified. According to the final DEA results, the loan platforms in this paper are divided into four grades: excellent, good, medium and poor. The efficiency of each sector can then be derived from the frequency with which its platforms appear in each grade.

3.2.1. Establishment of Decision-Making Unit

Assume there are n decision units, each with m types of “inputs” (the “resources” that the decision unit consumes) and s types of “outputs” (indicators of the “effectiveness” achieved after the decision unit consumes the resources). The input and output data of each decision unit are given in Table 2.

Table 2. Input and output data of the decision unit.

In the table,

${x}_{ij}$ is the amount of the i-th input consumed by the j-th decision unit, ${x}_{ij}>0$;

${y}_{rj}$ is the amount of the r-th output produced by the j-th decision unit, ${y}_{rj}>0$;

${v}_{i}$ is the measure (weight) of the i-th input;

${u}_{r}$ is the measure (weight) of the r-th output,

where $i=1,2,\cdots ,m$ , $r=1,2,\cdots ,s$ , $j=1,2,\cdots ,n$ . For convenience, denote

${x}_{j}={\left({x}_{1j},{x}_{2j},\cdots ,{x}_{mj}\right)}^{\text{T}},\quad j=1,2,\cdots ,n,$

${y}_{j}={\left({y}_{1j},{y}_{2j},\cdots ,{y}_{sj}\right)}^{\text{T}},\quad j=1,2,\cdots ,n,$

$v={\left({v}_{1},{v}_{2},\cdots ,{v}_{m}\right)}^{\text{T}},$

$u={\left({u}_{1},{u}_{2},\cdots ,{u}_{s}\right)}^{\text{T}}.$

3.2.2. Selection of Weighting Coefficient and Establishment of CCR Model

For weight coefficients $v\in {E}^{m}$ and $u\in {E}^{s}$ (v is an m-dimensional real vector and u is an s-dimensional real vector), the efficiency evaluation index of decision unit j is

${h}_{j}=\frac{\sum _{r=1}^{s}{u}_{r}{y}_{rj}}{\sum _{i=1}^{m}{v}_{i}{x}_{ij}},\quad j=1,2,\cdots ,n$ (5)

The weight coefficients u and v can always be chosen so that

${h}_{j}\leqq 1,\quad j=1,2,\cdots ,n.$

To evaluate the efficiency of decision unit ${j}_{0}$ ( $1\leqq {j}_{0}\leqq n$ ), take the weight coefficients u and v as variables, the efficiency index of decision unit ${j}_{0}$ as the objective, and the efficiency indexes of all decision units,

${h}_{j}\leqq 1,\quad j=1,2,\cdots ,n,$

as constraints. This gives the following C2R model:

$\left({\overline{\text{P}}}_{{\text{C}}^{2}\text{R}}\right)\quad \left\{\begin{array}{l}\mathrm{max}\frac{{u}^{\text{T}}{y}_{{j}_{0}}}{{v}^{\text{T}}{x}_{{j}_{0}}}={V}_{\overline{\text{P}}},\\ \text{s.t.}\ \ \frac{{u}^{\text{T}}{y}_{j}}{{v}^{\text{T}}{x}_{j}}\leqq 1,\quad j=1,2,\cdots ,n,\\ \qquad v\ge 0,\\ \qquad u\ge 0.\end{array}\right.$ (6)

Here “ $\leqq $ ” means that each component is less than or equal to; “ $\le $ ” means that each component is less than or equal to and at least one component is strictly less; and “<” means that each component is strictly less.
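Model (6) is a fractional program. As a sketch of the standard Charnes-Cooper transformation, which the text does not spell out, setting $t=1/\left({v}^{\text{T}}{x}_{{j}_{0}}\right)$, $\omega =tv$ and $\mu =tu$ gives an equivalent linear program. The symbols $t$, $\omega$ and $\mu$ are introduced here for illustration only:

```latex
\left(\text{P}_{\text{C}^{2}\text{R}}\right)\quad
\left\{\begin{array}{l}
\max\ \mu^{\text{T}} y_{j_0},\\
\text{s.t.}\ \ \omega^{\text{T}} x_j-\mu^{\text{T}} y_j\geqq 0,\quad j=1,2,\cdots,n,\\
\qquad \omega^{\text{T}} x_{j_0}=1,\\
\qquad \omega\geqq 0,\ \mu\geqq 0.
\end{array}\right.
```

The normalization $\omega^{\text{T}}x_{j_0}=1$ fixes the denominator of the ratio, so maximizing the numerator alone is equivalent to maximizing the original efficiency index.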

3.3. New DEA Model: Generalized DEA Model

The excellent network loan platforms are taken as the reference set and the other network loan platforms as the evaluation set. The generalized DEA model is used to obtain the catch-up efficiency of each non-excellent network loan platform, and projection analysis gives its improvement values. Finally, improvement strategies for the non-excellent platforms are derived from the indicators of the excellent platforms.

3.3.1. Establishment of Generalized DEA Model

Suppose there are n decision units to be evaluated and $\overline{n}$ sample units or standards (collectively referred to below as sample units), whose characteristics are represented by m input indicators and s output indicators:

${x}_{p}={\left({x}_{1p},{x}_{2p},\cdots ,{x}_{mp}\right)}^{\text{T}}$ represents the input index values of the p-th decision unit,

${y}_{p}={\left({y}_{1p},{y}_{2p},\cdots ,{y}_{sp}\right)}^{\text{T}}$ represents the output index values of the p-th decision unit,

${\overline{x}}_{j}={\left({\overline{x}}_{1j},{\overline{x}}_{2j},\cdots ,{\overline{x}}_{mj}\right)}^{\text{T}}$ represents the input index values of the j-th sample unit,

${\overline{y}}_{j}={\left({\overline{y}}_{1j},{\overline{y}}_{2j},\cdots ,{\overline{y}}_{sj}\right)}^{\text{T}}$ represents the output index values of the j-th sample unit,

all of which are positive. The following generalized DEA model can be constructed for decision unit p:

$\left({\text{G-C}}^{2}\text{R}\right)\quad \left\{\begin{array}{l}\mathrm{max}\ {\mu }^{\text{T}}{y}_{p}=V\left(d\right),\\ \text{s.t.}\ \ {\omega }^{\text{T}}{\overline{x}}_{j}-{\mu }^{\text{T}}d{\overline{y}}_{j}\geqq 0,\quad j=1,\cdots ,\overline{n},\\ \qquad {\omega }^{\text{T}}{x}_{p}=1,\\ \qquad \omega \geqq 0,\ \mu \geqq 0.\end{array}\right.$ (7)

where $\omega ={\left({\omega }_{1},{\omega }_{2},\cdots ,{\omega }_{m}\right)}^{\text{T}}$ is the weight vector of the input indicators, $\mu ={\left({\mu }_{1},{\mu }_{2},\cdots ,{\mu }_{s}\right)}^{\text{T}}$ is the weight vector of the output indicators, and d is a positive number called the moving factor.

3.3.2. The Establishment of the G-CCR Model

The dual model of the model (G-C2R) can be expressed as follows:

$\left({\text{DG-C}}^{\text{2}}\text{R}\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\{\begin{array}{l}\mathrm{min}\text{\hspace{0.17em}}\theta =D\left(d\right),\\ \text{s}\text{.t}.\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\displaystyle \sum _{j=1}^{\overline{n}}{\overline{x}}_{j}{\lambda}_{j}}\leqq \theta {x}_{p},\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\displaystyle \sum _{j=1}^{\overline{n}}d{\overline{y}}_{j}{\lambda}_{j}}\geqq {y}_{p},\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\lambda}_{j}\geqq 0,\text{\hspace{0.17em}}\text{\hspace{0.17em}}j=1,2,\cdots ,\overline{n}.\end{array}$ (8)

It can be proved that the (G-C2R) model has an optimal solution.
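To make the dual model concrete, the following Python sketch (standard library only, not the paper's MATLAB) treats the special case of a single input and a single output, where (DG-C2R) has a closed-form solution: the cheapest way to reach output $y_p$ is to scale the sample unit with the highest output/input ratio. This is an illustrative simplification and not the six-indicator computation of this paper; the function name `catch_up_efficiency` and the data are hypothetical.

```python
def catch_up_efficiency(x_p, y_p, reference, d=1.0):
    """Optimal theta of (DG-C2R) for one input and one output.

    reference: list of (x_bar, y_bar) pairs for the sample units, all positive.
    With one input and one output,
        theta = (y_p / x_p) / (d * max_j(y_bar_j / x_bar_j)).
    """
    best_ratio = max(y_bar / x_bar for x_bar, y_bar in reference)
    return (y_p / x_p) / (d * best_ratio)

# Hypothetical reference set and evaluated unit (illustration only)
reference_set = [(2.0, 4.0), (5.0, 5.0)]  # best output/input ratio is 2.0
theta = catch_up_efficiency(4.0, 4.0, reference_set)
theta_moved = catch_up_efficiency(4.0, 4.0, reference_set, d=0.5)
```

Here `theta` is 0.5, meaning the evaluated unit would need only half of its input to match the reference frontier, while moving the frontier with `d=0.5` raises the value to 1.0. The general multi-input, multi-output case requires a linear programming solver.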

3.3.3. Definition of the G-DEA Model

(1) If the optimal value of (G-C2R) satisfies $V\left(d\right)\geqq 1$ , decision unit p is said to be weakly effective relative to the d-times movement of the sample data frontier, abbreviated as G-DEA (d) weakly effective (G-C2R);

(2) If the optimal solution of (G-C2R) falls under one of the following cases:

① there is an optimal solution with ${\omega}^{0}>0,{\mu}^{0}>0$ such that $V\left(d\right)=1$ ;

② $V\left(d\right)>1$ ,

decision unit p is said to be effective relative to the d-times movement of the sample data frontier, abbreviated as G-DEA (d) effective (G-C2R).

In particular, when $d=1$ , G-DEA (1) weakly effective is called G-DEA weakly effective, and G-DEA (1) effective is called G-DEA effective.

3.3.4. Establishing the “Pursued Object” and the “Chasing Object”

From the rating results, the excellent network loan platforms are identified as the pursued objects and form the reference set; the non-excellent loan platforms are identified as the chasing objects and form the evaluation set.

3.4. The “Projection Analysis” Based on the “Catch-up Efficiency”

Definition of projection analysis:

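The projection of a decision unit onto the effective frontier is given by the DEA projection formula:

${\widehat{x}}_{i}=\theta {x}_{i}-{s}_{i}^{-},{\widehat{y}}_{i}={y}_{i}+{s}_{i}^{+}$ (9)

where $\theta $ is the optimal efficiency value and ${s}_{i}^{-}$ and ${s}_{i}^{+}$ are the input and output slack variables. The improvement values of the evaluated object are then: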

$\Delta {x}_{i}={x}_{i}-{\widehat{x}}_{i}=\left(1-\theta \right){x}_{i}+{s}_{i}^{-};\Delta {y}_{i}={\widehat{y}}_{i}-{y}_{i}={s}_{i}^{+}$ (10)
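Formulas (9) and (10) can be traced with a small Python sketch (standard library only, not the paper's MATLAB). The function name `project`, the efficiency value and the slack values below are hypothetical inputs chosen for illustration; in practice they would come from solving the dual model.

```python
def project(x, y, theta, s_minus, s_plus):
    """Apply the DEA projection formula and return projections and improvements."""
    x_hat = [theta * xi - s for xi, s in zip(x, s_minus)]  # projected inputs
    y_hat = [yi + s for yi, s in zip(y, s_plus)]           # projected outputs
    dx = [xi - xh for xi, xh in zip(x, x_hat)]             # input reductions
    dy = [yh - yi for yi, yh in zip(y, y_hat)]             # output increases
    return x_hat, y_hat, dx, dy

# Hypothetical unit with two inputs and one output (illustration only)
x_hat, y_hat, dx, dy = project(
    x=[10.0, 20.0], y=[5.0], theta=0.8, s_minus=[1.0, 0.0], s_plus=[2.0]
)
```

With these values the first input falls from 10 to 7, a reduction of 3 that matches $(1-\theta ){x}_{i}+{s}_{i}^{-}=2+1$, and the output rises by the slack of 2.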

4. Empirical Results and Analysis

4.1. Clustering Analysis Model

A qualitative examination of the nine evaluation indexes describing the sample network loan platforms suggests that some of them may be strongly correlated. To verify this, MATLAB is used to calculate the correlation coefficients, and R-type clustering analysis is then applied to classify the indicators.

4.1.1. The Standardization of Data Processing

(1) Averages

Calculate the average of each of the nine indicators over the 95 network loan platforms in February 2017; the nine indicators are denoted $\left[{x}_{1},{x}_{2},\cdots ,{x}_{9}\right]$ .

(2) Data standardization

In practical problems, different variables are often measured in different units, and in a multi-index evaluation system the nature, dimension and magnitude of the indexes often differ considerably. If the original data were analyzed directly, indexes with larger magnitudes would dominate the comprehensive analysis, while indexes with smaller magnitudes would have little influence. To eliminate the dimensional effect so that each variable has the same expressive power, and to ensure the reliability of the results, the data of each index must be standardized. That is

${b}_{ij}=\frac{{a}_{ij}-{\overline{a}}_{j}}{{s}_{j}},\quad i=1,2,\cdots ,m;\ j=1,2,\cdots ,n;\ m=95,n=9$ (11)

where

$\overline{{a}_{j}}=\frac{1}{m}{\displaystyle \sum _{i=1}^{m}{a}_{ij}},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{s}_{j}=\sqrt{\frac{1}{m-1}{\displaystyle \sum _{i=1}^{m}{\left({a}_{ij}-\overline{{a}_{j}}\right)}^{2}}}.$ (12)

The results of the standardized treatment are shown in Table 3 below:

Table 3. Standardized data.

Data source: calculated with MATLAB 7.11.0 and compiled by the author. Note: due to space constraints, only some of the platforms are listed in Table 3.

4.1.2. Determine the Variable Similarity Measure

When clustering variables, a similarity measure between the variables must first be established; here it is the correlation coefficient. Let the observed values of variable ${x}_{j}$ be ${\left({x}_{1j},{x}_{2j},\cdots ,{x}_{mj}\right)}^{\text{T}}\in {R}^{m}$ $\left(j=1,2,\cdots ,n\right)$ . The sample correlation coefficient of two variables ${x}_{j}$ and ${x}_{k}$ is then used as their similarity measure. That is:

${r}_{jk}=\frac{\sum _{i=1}^{m}\left({x}_{ij}-{\overline{x}}_{j}\right)\left({x}_{ik}-{\overline{x}}_{k}\right)}{{\left[\sum _{i=1}^{m}{\left({x}_{ij}-{\overline{x}}_{j}\right)}^{2}\sum _{i=1}^{m}{\left({x}_{ik}-{\overline{x}}_{k}\right)}^{2}\right]}^{\frac{1}{2}}},\quad m=95,n=9.$ (13)

The correlation coefficient matrix is shown in Table 4 below:

4.1.3. Calculate the Similarity Measure

The similarity between classes of indicators is obtained by the group average method:

$D\left({G}_{1},{G}_{2}\right)=\frac{1}{{n}_{1}{n}_{2}}\sum _{{x}_{i}\in {G}_{1}}\sum _{{x}_{j}\in {G}_{2}}d\left({x}_{i},{x}_{j}\right)$ (14)

It equals the average distance between pairs of sample points drawn one from each class, where ${n}_{1}$ and ${n}_{2}$ are the numbers of sample points in ${G}_{1}$ and ${G}_{2}$ respectively.

The cluster tree obtained with MATLAB is shown in Figure 1 below.

The cluster diagram shows that four indicators are strongly correlated: the average borrowing period (month), the transaction index, the popularity index and the divergence index. With the nine indicators grouped into the three risk dimensions of liquidity risk, market risk and credit risk, six core indicators can be selected from the nine and used as the input-output index system of the DEA model for further study. The meaning of each indicator is shown in Table 5 below.

Table 4. Correlation coefficient matrix.

Data Source: Calculated by MATLAB7.11.0, compiled by the author.

Figure 1. Indicator cluster tree.

Table 5. Core index system for Internet financial risk assessment.

Main data source: rating rules of the net loan platform development index; http://bbs.wdzj.com/thread-139449-1-1.html .

4.2. The Classical DEA-CCR Model Solution

4.2.1. Calculation of Efficiency Values

Three efficiency values (comprehensive, pure technical and scale) and the returns to scale of each P2P network loan company are obtained with the classical DEA-CCR model. The results are shown in Table 6:

Table 6. Three efficiency values and returns to scale of the P2P network loan companies.

The network loan platforms in Table 6 are divided into five sectors: private, banking, state-owned, listed company and venture capital. Figures 2-5 plot the platforms of each sector by their pure technical efficiency and scale efficiency values. Only two network loan platforms (LUp2p, LLJR) belong to the banking sector, so it is not plotted separately.

Figure 2. Efficiency spectrum distribution chart of the private sector.

Figure 3. Efficiency spectrum distribution chart of the state-owned sector.

Figure 4. Efficiency spectrum distribution chart of the listed company sector.

4.2.2. Rating Classification

Based on the comprehensive efficiency values in Table 6, the efficiency amplitude is defined as “A = the maximum value of efficiency - the minimum value of efficiency”, which gives A = 0.5976. The classification step size is “d = A/n”, with which the network loan platforms can be divided into n categories. Here the 95 network loan platforms are divided into four grades (excellent, good, medium and poor), so the step size is d = 0.5976/4 = 0.1494 and the boundaries between the grades are 0.5518, 0.7012 and 0.8506. The percentage frequency distribution is shown in Figure 6, and the rating classification frequencies are shown in Table 7.
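The step-size classification can be reproduced with a short Python sketch (standard library only, not the paper's MATLAB). The minimum efficiency of 0.4024 and maximum of 1.0 used below are not stated in the text; they are inferred from A = 0.5976 and the reported boundaries, and the treatment of values lying exactly on a boundary is likewise an assumption. The function names are hypothetical.

```python
def grade_boundaries(e_min, e_max, n_grades):
    """Split [e_min, e_max] into n_grades equal steps; return inner boundaries."""
    step = (e_max - e_min) / n_grades
    return [e_min + k * step for k in range(1, n_grades)]

def grade(eff, boundaries, labels=("poor", "medium", "good", "excellent")):
    """Label an efficiency value by counting the boundaries it reaches."""
    return labels[sum(eff >= b for b in boundaries)]

# e_min = 0.4024 and e_max = 1.0 are inferred values (assumption)
bounds = grade_boundaries(0.4024, 1.0, 4)
rounded = [round(b, 4) for b in bounds]  # 0.5518, 0.7012, 0.8506
```

A platform with a comprehensive efficiency of 0.9 would then be graded excellent, and one with 0.45 would be graded poor.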

4.2.3. Result Analysis

Overall, the bank-affiliated loan platforms have the highest efficiency. The efficiency of the venture capital sector and the state-owned sector is higher than that of the private sector and the listed company sector.

The figures above show that the comprehensive efficiency of the private sector network loan platforms is not high: 13.79% are at the excellent level, 13.79% at the good level, 34.48% at the medium level and 37.93% at the poor level. The banking sector has only two network loan platforms, both at the excellent level. In the listed company sector, 19.23% are at the excellent level, 30.77% at the good level, 23.08% at the medium level and 26.92% at the poor level. In the state-owned sector, 30% are at the excellent level, 40% at the good level, 15% at the medium level and 15% at the poor level. In the venture capital sector, the excellent, good and medium levels each account for 22.22%, and the poor level for 33.33%.

Figure 5. Efficiency spectrum distribution chart of the venture capital sector.

Figure 6. The percentage frequency chart.

Table 7. Rating classification frequency.

4.3. New DEA - Generalized DEA Model

Calculation of “Catch-up Efficiency” Based on Generalized DEA Model

In the classical DEA method the reference system consists of the efficient decision-making units, but in practice the objects that people need to compare against are not limited to outstanding units; they may also be ordinary units. Therefore, we use the generalized DEA method to calculate the “catch-up efficiency” values, taking the excellent, good and medium network loan platforms as the reference sets and the poor network loan platforms as the evaluation set, as shown in Table 8.
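The idea of measuring a unit against a reference set that it does not belong to can be illustrated with a deliberately simplified sketch. It assumes a single input and a single output under constant returns to scale, in which case the input-oriented CCR efficiency has a closed form: the unit's output/input ratio divided by the best ratio in the reference set. The paper's model uses several indicators and would require solving a linear program for each unit; the function name and all numbers below are hypothetical.

```python
def catch_up_efficiency(unit, reference_set):
    """Efficiency of `unit` relative to an external reference set.

    Simplifying assumptions: one input, one output, constant returns to scale,
    and the frontier is built from the reference set only.
    `unit` is an (input, output) pair; `reference_set` is a list of such pairs.
    """
    x0, y0 = unit
    best_ratio = max(y / x for x, y in reference_set)  # frontier productivity
    return (y0 / x0) / best_ratio


# Hypothetical reference set (for example, platforms of a better grade).
reference = [(2.0, 4.0), (4.0, 6.0)]

print(catch_up_efficiency((5.0, 5.0), reference))  # 0.5: unit lags the frontier
print(catch_up_efficiency((1.0, 3.0), reference))  # 1.5: unit exceeds the frontier
```

A value below 1 indicates how far the evaluated unit must contract its input to catch up with the chosen reference set, and because the unit is not part of that set, values above 1 are possible.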

4.4. “Projection Analysis” Based on the “Catch-up Efficiency”

Projection Analysis Results

Due to limited space, only the last 10 P2P network loan platforms of the poor set are taken as evaluation objects below. As shown in Table 9, the projection values and the improvement values are discussed using the medium, good and excellent P2P network loan platform clusters as reference sets respectively.

Table 8. Calculation results of the “catch-up efficiency” for the good, medium and poor network loan platforms.

Source of data: compiled by the authors.

Table 9. The evaluation value of the hybrid evaluation unit on the efficient frontier of the excellent reference set.
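The relation between a projection value and an improvement value can be sketched with the standard input-oriented DEA projection formulas: each input is scaled by the efficiency score and reduced by its slack, and each output is raised by its slack. The improvement value is then the projection minus the original value. This is an illustrative sketch only; the efficiency score and slacks would come from solving the DEA model, and every number and name below is hypothetical, not taken from Table 9 or Table 10.

```python
def project(inputs, outputs, theta, input_slacks, output_slacks):
    """Project a unit onto the reference frontier (input-oriented form).

    x_hat = theta * x - s_minus   for each input
    y_hat = y + s_plus            for each output
    """
    x_hat = [theta * x - s for x, s in zip(inputs, input_slacks)]
    y_hat = [y + s for y, s in zip(outputs, output_slacks)]
    return x_hat, y_hat


def improvement_values(original, projected):
    """Improvement value = projection value - original value (negative = reduce)."""
    return [p - o for o, p in zip(original, projected)]


# Hypothetical unit with two inputs and two outputs.
inputs, outputs = [10.0, 100.0], [20.0, 30.0]
theta = 0.5                       # hypothetical efficiency score
x_hat, y_hat = project(inputs, outputs, theta, [0.0, 10.0], [5.0, 0.0])

print(improvement_values(inputs, x_hat))   # negative entries: inputs to reduce
print(improvement_values(outputs, y_hat))  # positive entries: outputs to raise
```

Read this way, a negative improvement value on an input corresponds to a "decreased by" recommendation and a positive value on an output corresponds to an "increased by" recommendation.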

4.5. The Optimal Catch-up Strategy Based on “Projection Analysis”

Table 10. Improvement values of the poor platforms relative to the better-performing reference sets.

As Table 10 shows, the last ten P2P network loan platforms in the poor set are hexindai, eweidai, Honhe, CreditFinance, RJS, wanglibao, goodture, ppdai, YINHU and WYJR168. Their improvement strategies, taking the medium, good and excellent platform clusters as reference sets, are as follows:

4.5.1. Taking the Excellent Network Loan Platforms as the Reference Set

hexindai: reduce the average expected rate of return by 6.6563 and the registered capital by 54.4032 million yuan; increase the leverage index by 53.6832 and the liquidity index by 14.4910.

eweidai: reduce the average expected rate of return by 6.1429 and the registered capital by 2591.9516 million yuan; increase the leverage index by 4.2620, the liquidity index by 30.9282 and the transparency index by 5.5415.

Honhe: reduce the average expected rate of return by 6.7078 and the registered capital by 2699.7842 million yuan; increase the leverage index by 28.8373, the liquidity index by 21.3040 and the transparency index by 3.0971.

CreditFinance: reduce the average expected rate of return by 7.1392 and the registered capital by 3626.8778 million yuan; increase the leverage index by 18.2110 and the liquidity index by 13.5705.

RJS: reduce the average expected rate of return by 6.4104 and the registered capital by 1590.6582 million yuan; increase the leverage index by 49.9577 and the liquidity index by 4.6934.

wanglibao: reduce the average expected rate of return by 5.8004 and the registered capital by 2660.7425 million yuan; increase the leverage index by 2.6093 and the liquidity index by 8.1468.

goodture: reduce the average expected rate of return by 7.5437 and the registered capital by 2665.6277 million yuan; increase the leverage index by 39.6025 and the liquidity index by 19.3419.

ppdai: reduce the average expected rate of return by 9.9610 and the registered capital by 55.99226 million yuan; increase the leverage index by 59.2869 and the liquidity index by 13.5821.

YINHU: reduce the average expected rate of return by 6.7753 and the registered capital by 11264.0438 million yuan; increase the leverage index by 16.5409 and the liquidity index by 5.4076.

WYJR168: reduce the average expected rate of return by 7.7505 and the registered capital by 5975.6938 million yuan; increase the dispersion index by 20.4859 and the liquidity index by 15.6894.

4.5.2. Taking the Good Network Loan Platforms as the Reference Set

hexindai: reduce the average expected rate of return by 4.0941 and the registered capital by 3346.1246 million yuan.

eweidai: increase the credit index by 52.3450 and the transparency index by 16.4477.

Honhe: reduce the average expected rate of return by 1.9247 and the registered capital by 774.6795 million yuan; increase the liquidity index by 39.0142 and the transparency index by 7.7663.

CreditFinance: reduce the average expected rate of return by 5.2694 and the registered capital by 3626.8778 million yuan; increase the liquidity index by 4.3450.

RJS: reduce the average expected rate of return by 3.6995 and the registered capital by 917.9837 million yuan; increase the leverage index by 2.6106.

wanglibao: reduce the average expected rate of return by 3.8924 and the registered capital by 1785.4935 million yuan; increase the dispersion index by 4.6422 and the liquidity index by 2.7684.

goodture: reduce the average expected rate of return by 3.1210 and the registered capital by 1102.8432 million yuan; increase the liquidity index by 31.0273 and the transparency index by 0.7216.

ppdai: reduce the average expected rate of return by 6.7852 and the registered capital by 381.455 million yuan.

YINHU: reduce the average expected rate of return by 4.9056 and the registered capital by 815.58680 million yuan; increase the liquidity index by 13.0792.

WYJR168: reduce the average expected rate of return by 5.5160 and the registered capital by 422.5286 million yuan; increase the dispersion index by 0.5867 and the liquidity index by 19.5509.

4.5.3. Taking the Medium Network Loan Platforms as the Reference Set

hexindai: reduce the average expected rate of return by 3.1524 and the registered capital by 2576.4847 million yuan; increase the liquidity index by 0.1367.

eweidai: increase the credit index by 35.1042 and the transparency index by 14.9267.

Honhe: reduce the average expected rate of return by 2.8418 and the registered capital by 114.3788 million yuan; increase the liquidity index by 6.1034 and the transparency index by 5.0501.

CreditFinance: reduce the average expected rate of return by 3.7146 and the registered capital by 18.8790 million yuan; increase the leverage index by 19.0412 and the liquidity index by 5.8967.

RJS: increase the leverage index by 24.7954, the dispersion index by 44.6168 and the transparency index by 20.8407.

wanglibao: reduce the average expected rate of return by 3.0307 and the registered capital by 13.90233 million yuan; increase the leverage index by 1.7679.

goodture: reduce the average expected rate of return by 3.4629 and the registered capital by 1223.6538 million yuan.

ppdai: reduce the average expected rate of return by 5.6644 and the registered capital by 3184.0136 million yuan; increase the leverage index by 25.6012.

YINHU: reduce the average expected rate of return by 3.7579 and the registered capital by 6247.5543 million yuan; increase the dispersion index by 4.6750 and the liquidity index by 8.0324.

WYJR168: reduce the average expected rate of return by 4.9851 and the registered capital by 3843.5256 million yuan; increase the dispersion index by 21.0267 and the liquidity index by 16.1895.

4.6. Summary of Improvement Measures

Whichever reference set is used (excellent, good or medium), the improvement strategy for the poor network loan platforms points in the same direction: the average expected yield and the registered capital should be reduced from their original values, while the leverage index, the dispersion index, the liquidity index, the transparency index and other indicators should be increased by the corresponding amounts.

5. Countermeasures

The improvement measures proposed for the poor network loan platforms are based on the generalized DEA results. They include reducing the registered capital and the average expected yield, and increasing the leverage index, liquidity index, dispersion index, transparency index, etc. In view of the meaning of these factors, and of what raising or lowering each index implies for a network loan platform, this paper gives the following recommendations:

5.1. Establish Risk Early Warning Mechanism

Put risk control at the forefront of the development process. China’s network loan platforms are currently developing rapidly, but many problems remain. One of the most serious is the high risk that accompanies the high interest rates offered on network loan platforms. Therefore, when making investments, a network loan platform should take full account of the safety of assets, operate its assets normally within the scope of the law and, based on investors’ own economic level and risk resistance, minimize the risk of capital investment. In the development process a platform must not only seize opportunities but also resist risks in a timely manner, putting risk control first, so as to reduce risk while increasing capital liquidity, profitability and asset volume.

5.2. Improve Service Levels

Give full play to the role of human capital in the management of the network loan platform, especially senior management staff, and raise the level of grassroots staff. Improve service levels to improve the customer experience, cultivate a high-quality customer base and establish a good reputation, in order to increase the platform’s transaction volume. Intermediate costs can also be reduced appropriately to promote borrowing and lending transactions, thereby improving the transaction index.

5.3. Increase the Proportion of Long-Term Debt

Increase the proportion of long-term liabilities in total assets. Capital leverage is equivalent to the net debt ratio, which refers to the ratio of long-term debt to shareholders’ equity. When capital leverage is small, the company’s degree of debt capitalization is low and its long-term debt pressure is small; conversely, when capital leverage is large, the company’s degree of debt capitalization is high and its long-term debt pressure increases. Long-term debt is relatively stable and is repaid over the next several fiscal years, so the company will not face much liquidity risk and its debt pressure in the short term is small. The network loan platform can use long-term debt to finance fixed assets and expand its operations. Therefore, the platform may appropriately increase the proportion of long-term liabilities in total assets and use the two-way multiplier effect of leverage to obtain a large return from a small investment.

5.4. Improve the Transparency of Funds

The network loan platform shall promptly disclose the relevant financial information to the public. Transparency is one aspect of good fund management, but it is not an end in itself; it is a means of promoting efficiency and of ensuring that regulatory organizations and network loan platforms take responsibility. Increased transparency of funds includes transparency of the system, transparency of accounting and transparency of indicators. To improve its transparency, the network loan platform shall promptly disclose financial information to the public, including detailed descriptions of necessary financial matters, the detailed structure of the platform, its functional structure, a clear legal basis, and so on. The platform should also promptly publish financial analysis and forecast indicators, including the financial structure and the cyclical balance, financial sustainability (basically stable debt), the average expected return period, etc., so that investors and borrowers can select the most suitable loan platform and so that short-term, medium-term and long-term funds are transparent.

6. Conclusion

This paper studies nine indicators as of February 2017 for network loan platforms selected from the private sector, banking sector, listed company sector, venture capital sector and state-owned sector. First, MATLAB is used to perform R-type clustering according to the correlation coefficients of the nine indicators, and six important indicators are selected to reduce the dimension. The DEA method is then used to establish the input and output system of the selected indicators and to evaluate the comprehensive efficiency, pure technical efficiency and scale efficiency of the network loan platforms. On this basis, the top five platforms by comprehensive efficiency are LUp2p, SOUYIDAI, Eastlending, 51jbb and HONGLING CAPITAL. Second, according to the defined step size, the 95 network loan platforms studied are divided into four grades: excellent, good, medium and poor. There are 21 excellent platforms, 22 good platforms and 21 medium platforms, and the rest are poor. Finally, with the excellent, good and medium network loan platforms as reference sets and the last ten poor network loan platforms as the evaluation set, the relative efficiency of the last ten poor platforms is calculated using the generalized DEA. Based on the DEA projection analysis of these ten platforms, the best strategy is provided for the direction and path of reform of the poor network loan platforms, such as reducing the average expected yield and registered capital and increasing the leverage index, liquidity index and so on.

Acknowledgements

This research was carried out with the support of the National Natural Science Foundation of the People’s Republic of China (projects 71661025 and 11602115).

Cite this paper

Lv, X. M., Zhou, L., Zhang, R., & Guo, X. N. (2017). Research on P2P Network Loan Risk Evaluation Based on Generalized DEA Model and R-Type Clustering Analysis under the Background of Big Data. Journal of Financial Risk Management, 6, 163-190. https://doi.org/10.4236/jfrm.2017.62013
