When creating an asset management plan, missing data is perceived to be a huge problem, especially when event data (breaks in water distribution pipes, for example) are not tracked. The lack of tracking makes it difficult to determine which factors are the critical ones. Many utilities lack the resources for examining buried infrastructure, so other methods of data collection are needed. The concept for this paper was to develop a means to acquire data on the assets for a condition assessment (buried pipe is not visible and, in most cases, cannot really be assessed). What was found was that for buried infrastructure, much more information was known than anticipated, and exact information is not really necessary. However, there was a need to track events (breaks, flooding, etc.) that would indicate a “failure”. Such tracking would be useful for predicting future maintenance needs and identifying the most at-risk assets.
During the period 1973 to 1985, public net investment in the United States and Japan averaged 0.3% and 5.1% of gross domestic product, while their respective growth rates of real gross domestic output per employed person were 0.6% and 3.1% per annum (OECD National Accounts and Historical Statistics). Twenty-five years ago, Aschauer [
However, at present state and local governments spend about 2.4% of their GNP on infrastructure, as compared to 3.1% in 1970 [
Asset management is a process of integrating design, construction, maintenance, rehabilitation, and renovation to maximize benefits and minimize cost. It is a plan for managing an organization’s infrastructure through a decision-making process driven by a defined standard level of service. The term asset management refers to business principles aimed at balancing risk and minimizing life-cycle costs of the physical assets, such as pipes, roads, structures and equipment. Asset management is also used as a tool to help municipalities gauge the health of infrastructure [
Asset management plays a vital role to help minimize unnecessary or misplaced spending while meeting the health and environmental needs of a community [
The reliability of the assets within the area of interest starts with the design process. Decision-making dictates how assets will be maintained and the means to assure the maximum return on investments. An inventory of assets needs to be established. Depending on the accuracy wanted, the data can be gathered in many ways, ranging from on-site field investigation, which could take a lot of time, to using existing maps, or using maps while verifying the assets with aerial photography and video. Through condition assessment, the probability of failure can be estimated. Assets can also fail by exceeding their maximum capacity. Prioritizing the assets by a defined system will allow the community to see which areas are most vulnerable to failure, which assets need the most attention due to their condition, and where the critical assets are located in relation to major public areas with a high population (hospitals, schools, etc.).
But what if the infrastructure data is limited, which is often the case with buried pipelines? In such cases it is difficult to analyze the condition of the system and prioritize repair and replacement dollars. The goal of this paper is to outline a means to assess the system’s condition, by evaluating what can be inferred from the known data, without the need to dig up the piping. The question is how to collect potentially useful data without destructive testing of buried infrastructure, which is costly and inconvenient. The reality is that there is more data than one thinks.
The problem with condition assessments is that for many infrastructure assets, determining the condition is nigh on impossible. The condition of buried infrastructure is nearly impossible to determine without unburying it. Even infrastructure that is visible may provide a false assessment. A fire hydrant is partially buried. The foundations for bridges and the base of a roadway are not visible. Stormwater pipes may be visible only at outlets. Hence many asset management programs stall when there is a need to assess the condition of the assets. Many assume that since the buried infrastructure is unseen, the condition cannot be determined. But this is rarely true. There is usually some information that is known, but the certainty of this information is the challenge. Uncertainty is a concept that many people, including engineers, operations staff and administrators, are uncomfortable with. However, statistical analysis is the mathematical means to address this uncertainty, since even with some data, there is still uncertainty.
Several statistical methods have been developed to attempt to exploit limited information: resampling (bootstrap and jack-knife methods), fuzzy set theory, interval analysis, information theory and Bayesian methods [
Fuzzy set theory/logic was proposed as a paradigm shift in logic that involves a set of rules that define boundaries, and solves problems within those boundaries [
Interval analysis is an approach to the analysis of systems when the value of the quantity measured is uncertain. Interval analysis defines the value of the quantity by specifying the interval that the value is guaranteed to fall within. The methodology provides a correct formal method for measuring the upper and lower bounds required for the worst possible case. Interval analysis is not as powerful as other statistical methods when empirical information is available. Some prior information or data to create the subjective opinion is required. Like the methods previously discussed, updating with new data is not feasible, and with limited data, the ability to refine the distribution with prior data is desirable.
A solution to address at least some of these issues comes from Shannon’s Information Theory and a theorem that he proved in 1948: “The probability distribution having maximum entropy (uncertainty) over any finite range of real values is the uniform distribution over that range” [
$$H = -\sum_i p_i \ln p_i \tag{1}$$
By maximizing information entropy, the most conservative or broadest distribution consistent with the available information can be derived - such as the mean, variance and range [
$$\sum_i p_i = 1 \tag{2}$$
and
$$\sum_i F(x_i)\, p_i = E\big(F(X)\big) \quad \text{for all functions } F \tag{3}$$
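As a rough illustration of Equation (1) and the normalization constraint in Equation (2), the sketch below computes the information entropy of two discrete distributions over five outcomes. The function name `shannon_entropy` and the "peaked" probabilities are assumptions chosen only for illustration; they do not come from the paper.

```python
import math

def shannon_entropy(probs):
    """Information entropy H = -sum(p_i * ln p_i); zero-probability terms contribute nothing."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1 (Equation (2))")
    return -sum(p * math.log(p) for p in probs if p > 0)

# No information at all: every one of five outcomes is equally likely.
uniform = [0.2] * 5
# Hypothetical distribution after some information favors two of the outcomes.
peaked = [0.05, 0.40, 0.40, 0.10, 0.05]

h_uniform = shannon_entropy(uniform)
h_peaked = shannon_entropy(peaked)
print(f"uniform: {h_uniform:.3f}  peaked: {h_peaked:.3f}")
```

The uniform case reaches the maximum possible value, ln 5, which is consistent with Shannon's theorem quoted above; any added information lowers the entropy.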
The concepts of information entropy are a useful theoretical underpinning in the application of Bayesian methods, useful in many aspects of the analysis [
The selection of Bayesian methods assumes that the absolute or unconditional probability density function p(x) on X is the underlying distribution found through curve-fitting. Its form, as defined by Aitcheson and Dunmore [
$$p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta \tag{4}$$
where p(x) and p(θ) are completely different, and independent functions and the function p(θ) is the prior distribution. The Bayesian approach is to assume that while the true value of θ is unknown, there are probabilities that can be assigned for a series of possible values of θ [
For the purposes of infrastructure assessment, the “mean” value of θ, or E(θ), is akin to the expected age, material, soil, depth, traffic, groundwater table, or other factor that the assessor wishes to consider. As with Bayesian statistical methods generally, the more data gathered for a given asset, even when not known with certainty, the less likely it is that any error in the estimate will obscure the true condition. The more information, the more likely it is that the outcome can be predicted. As a result, Bayesian methods permit the prediction to evolve with added data: the shape of the distribution of results becomes more narrowly focused on the likely solution. Inference, in the absence of fact, is the key. The concept works for sizes of incidents, condition of infrastructure and likelihood of failure.
The only thing missing from the data gathered so far is an “event” or consequence. To be useful, there must be some form of tracking of consequences: breaks, flooding, etc. So, the agency must identify whether there is data that records the events, such as work orders. If so, the records may contain enough data to piece together missing variables that would be useful to add to the puzzle. Exact accuracy is not needed, but as much information as is available is helpful. An example illustrates the point.
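A discrete version of Equation (4) may make the idea more concrete. In the sketch below, θ is taken to be the annual probability that a pipe segment breaks, and the integral becomes a weighted sum over a few candidate values. The candidate values, the prior weights and the function names are all hypothetical, chosen only to show the mechanics.

```python
# Hypothetical candidate values of theta: annual probability that a pipe segment breaks.
thetas = [0.01, 0.05, 0.20]
# Assumed prior weights p(theta) for those candidates; they must sum to 1.
prior = [0.5, 0.3, 0.2]

def marginal_break_probability(thetas, weights):
    """Discrete form of Equation (4): p(x) = sum over theta of p(x | theta) * p(theta)."""
    return sum(t * w for t, w in zip(thetas, weights))

def update_after_break(thetas, weights):
    """Bayes' rule after one observed break: posterior is proportional to p(break | theta) * p(theta)."""
    unnormalized = [t * w for t, w in zip(thetas, weights)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

p_break = marginal_break_probability(thetas, prior)
posterior = update_after_break(thetas, prior)
p_break_next = marginal_break_probability(thetas, posterior)
print(f"before: {p_break:.3f}  after one observed break: {p_break_next:.3f}")
```

Observing a break shifts weight toward the higher values of θ, so the predicted probability of another break rises. This is the narrowing of the distribution with added data that the text describes.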
Assume that there are five arbitrary levels of condition available to analyze the asset: excellent/new, good, fair, poor and failed. If there is an asset and there is no information about it, the condition could be any one of these. The probability is 0.2, or 20%, for each. Assume that one data point is known; that would change the analysis considerably. Or what if the data were “sort of” known, say a probability that the asset was good or fair based on some factor? Then the probability would shift toward the good/fair condition and away from poor, failed or excellent. Still there is uncertainty involved. This is precisely what Bayesian statistical methods are trying to get at. The assessor has a lot more data than one thinks, even though much of it may not be known with complete certainty. The uncertainty is contained in the judgment of the assessor about certain factors.
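The five-state example can be sketched in a few lines of code. The version below starts from the uniform 20% prior and applies two pieces of "sort of known" evidence using Bayes' rule (multiply, then renormalize). The evidence values for pipe material and soil are hypothetical judgments invented for illustration, and treating the two factors as independent is an assumption of this sketch.

```python
STATES = ["excellent", "good", "fair", "poor", "failed"]

def update(prior, likelihood):
    """Bayes' rule over the condition states: multiply prior by likelihood, then renormalize."""
    unnormalized = {s: prior[s] * likelihood[s] for s in STATES}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

# No information: each of the five conditions is equally likely (0.2 each).
prior = {s: 1.0 / len(STATES) for s in STATES}

# Hypothetical assessor judgments: how strongly each factor points to each condition.
material_evidence = {"excellent": 0.10, "good": 0.35, "fair": 0.35, "poor": 0.15, "failed": 0.05}
soil_evidence = {"excellent": 0.05, "good": 0.40, "fair": 0.35, "poor": 0.15, "failed": 0.05}

belief = update(prior, material_evidence)
belief = update(belief, soil_evidence)
most_likely = max(belief, key=belief.get)

for state in STATES:
    print(f"{state:10s} {belief[state]:.3f}")
```

Each additional factor concentrates the probability further on the good/fair states while leaving some weight on the others, so the uncertainty is reduced but never removed.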
Continuing the example, most utilities have a pretty good idea about the pipe materials. Worker memory can be very useful, even if not completely accurate. In most cases the depth of pipe is fairly similar, and the deviations may be known. Soil conditions may be useful: there is an indication that aggressive soil causes more corrosion in ductile iron pipe, and most soil information is readily available even if it is less specific per pipe or valve than desired. That can then be used as a predictive tool to help identify assets that are most likely to become a problem. Bloetscher et al. [
Construction may have altered the soils; for example, muck and rock likely were replaced during construction with good fill. Likewise, tree roots will wrap around pipes, so their presence may indicate damage to the pipe. But no one can know this with certainty without digging the pipe up, something most communities would prefer to avoid. The presence of trees, however, is easily noted from aerials. Roads with truck traffic experience more vibration, causing rocks to move toward the pipe and joints to flex. That brings up another possible variable, the field perception: what do the field crews recall about breaks? Are there work orders? If so, do they contain the data needed to piece together missing variables that would be useful to add to the puzzle? With a little research there are at least 5 variables known.
Assume there are 9 variables that are developed. Each one has an assessment of adding to excellent, good, fair or poor condition of the pipe. These probabilities are added each time to build an understanding of overall condition (this is what Equation (4) is trying to represent).
This asset has a condition that is most probably good, maybe fair. It is probably not poor or excellent.
Consequences

Ultimately there is an interest in determining whether these factors have an impact on a consequence. Determining what those consequences are is the issue, so one needs to know what that response is:
・ Water main breaks?
・ Sewer breaks?
・ Sanitary sewer blockages or overflows?
・ Stormwater system overflows?
・ Roadway damage?
If the break history or sewer pipe condition is known, the impact of these factors can be developed via a linear regression model. The model would take the form:
$$CI = w_1 C_1 + w_2 C_2 + w_3 C_3 + w_4 C_4 + \cdots + w_i C_i \tag{5}$$
where:
・ CI = Condition index
・ w = weighting factor
・ C = condition factor
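Equation (5) is a plain weighted sum, which the following sketch evaluates for a single asset. The weights and condition factor scores are hypothetical placeholders; in practice the weights would come from the regression described in the text.

```python
def condition_index(weights, factors):
    """Equation (5): CI = w_1*C_1 + w_2*C_2 + ... for one asset."""
    if len(weights) != len(factors):
        raise ValueError("each condition factor needs exactly one weight")
    return sum(w * c for w, c in zip(weights, factors))

# Hypothetical weights for four factors (e.g. material, soil, traffic, trees).
weights = [0.4, 0.3, 0.2, 0.1]
# Hypothetical condition factor scores for one pipe segment.
factors = [3, 2, 4, 1]

ci = condition_index(weights, factors)
print(f"condition index: {ci:.2f}")
```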
If one knows the incident, the weights can be found:
$$f(x) = c_1 x_1 + c_2 x_2 + c_3 x_3 + \cdots + c_n x_n \tag{6}$$
where the values of $c$ are real numbers and
$$x = [x_1 \;\; x_2 \;\; x_3 \;\; \cdots \;\; x_n]^{T} \tag{7}$$
are the factors such as trees, materials, traffic, etc. The linear variables in the matrices are assumed to be non-negative. If there are negative values, they must be made positive as follows:
$$x_i^{+} = \begin{cases} x_i & \text{if } x_i \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$
$$x_i^{-} = \begin{cases} -x_i & \text{if } x_i < 0 \\ 0 & \text{otherwise} \end{cases} \tag{9}$$
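The split in Equations (8) and (9) can be sketched as follows. The negative part is taken to apply when x_i < 0, which is the usual convention and is an assumption about the intent of Equation (9). The function name and sample values are illustrative only.

```python
def split_parts(x):
    """Split each value into a non-negative positive part and a non-negative negative part."""
    positive = [v if v >= 0 else 0 for v in x]   # Equation (8)
    negative = [-v if v < 0 else 0 for v in x]   # Equation (9), applied when x_i < 0
    return positive, negative

x = [2.5, -1.0, 0.0, -3.2]
x_plus, x_minus = split_parts(x)

# The original values are recovered as x_i = x_i(+) - x_i(-).
recombined = [p - n for p, n in zip(x_plus, x_minus)]
print(x_plus, x_minus, recombined)
```

Both parts are non-negative, so each can be used wherever the method requires non-negative variables, and no information is lost because the difference restores the original value.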
Based on the conceptual understanding of the “best guess” of data on the infrastructure, the following are the steps required to obtain a condition assessment with limited data, utilizing a series of assets gleaned from utility records for a water system for example purposes:
・ Step 1: Create a table of assets (see
・ Step 2: Create columns for the variables for which there is data (
・ Step 3: Note that where there are categorical variables (type of pipe, for example), these need to be converted to separate yes/no questions, as mixing categorical and numerical variables does not provide appropriate comparisons; hence the need to alter the categorical variables to absence/presence variables. So descriptive variables like pipe material need to be converted to binary form, i.e. create a column for each material and insert a 1 or 0 for “yes” and “no” (see
・ Step 4: Summarize the statistics for the variables. Note missing data is not permitted and known conditions should be entered directly (see
・ Step 5: Develop a linear regression to determine factors associated with each and the amount of influence that each exerts. The result will yield a series of coefficients (see
・ Step 6: Identify the predictive equation. In this case it is:
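$$\begin{aligned}\text{Likelihood of leaks} = {} & -12.355 - 0.489\,\text{Dia} + 0.008\,\text{Age} + 3.144\,\text{Sand} + 1.151\,\text{LowTraffic} \\ & + 5.961\,\text{HeavyTraffic} - 2.819\,\text{Trees} + 0.297\,\text{NoTrees} + 0.580\,\text{ShallowUnder6} \\ & + 0.194\,\text{Pressure} + 2.342\,\text{Ductile} + 5.473\,\text{GI} + 3.229\,\text{PVC} + 12.428\,\text{AC}\end{aligned} \tag{10}$$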
Asset | breaks in 10 years | Dia | Age | Sand | Clay | Low Traffic | Heavy Traffic | Trees | no trees | shallow under 6 | deep bury | pressure | Ductile | GI | PVC | AC | HDPE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
water main | 17 | 2 | 45 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 0 | 0 | 0 | 1 | 0 |
water main | 11 | 2 | 45 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 0 | 0 | 0 | 1 | 0 |
water main | 12 | 2 | 45 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 0 | 0 | 0 | 1 | 0 |
water main | 10 | 2 | 45 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 0 | 0 | 0 | 1 | 0 |
water main | 2 | 4 | 50 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 1 | 0 | 0 | 0 | 0 |
water main | 3 | 6 | 60 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 55 | 1 | 0 | 0 | 0 | 0 |
water main | 1 | 6 | 60 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 55 | 1 | 0 | 0 | 0 | 0 |
water main | 1 | 6 | 60 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 55 | 1 | 0 | 0 | 0 | 0 |
water main | 0 | 6 | 20 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 55 | 0 | 0 | 1 | 0 | 0 |
Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
---|---|---|---|---|---|---|---|
Dia | 93 | 0 | 93 | 1.000 | 16.000 | 5.011 | 3.255 |
Age | 93 | 0 | 93 | 5.000 | 60.000 | 29.194 | 18.212 |
Sand | 93 | 0 | 93 | 0.000 | 1.000 | 0.946 | 0.227 |
Clay | 93 | 0 | 93 | 0.000 | 1.000 | 0.054 | 0.227 |
Low Traffic | 93 | 0 | 93 | 0.000 | 2.000 | 0.968 | 0.231 |
Heavy Traffic | 93 | 0 | 93 | 0.000 | 1.000 | 0.043 | 0.204 |
Trees | 93 | 0 | 93 | 0.000 | 1.000 | 0.903 | 0.297 |
no trees | 93 | 0 | 93 | 0.000 | 1.000 | 0.215 | 0.413 |
shallow under 6 | 93 | 0 | 93 | 0.000 | 1.000 | 0.849 | 0.360 |
deep bury | 93 | 0 | 93 | 0.000 | 1.000 | 0.032 | 0.178 |
pressure | 93 | 0 | 93 | 55.000 | 65.000 | 55.323 | 1.616 |
Ductile | 93 | 0 | 93 | 0.000 | 1.000 | 0.419 | 0.496 |
GI | 93 | 0 | 93 | 0.000 | 1.000 | 0.054 | 0.227 |
PVC | 93 | 0 | 93 | 0.000 | 1.000 | 0.247 | 0.434 |
AC | 93 | 0 | 93 | 0.000 | 1.000 | 0.065 | 0.247 |
HDPE | 93 | 0 | 93 | 0.000 | 5.000 | 1.075 | 2.065 |
Source | Value | Standard error | t | Pr > |t| | Lower bound (95%) | Upper bound (95%) |
---|---|---|---|---|---|---|
Intercept | −12.355 | 6.805 | −1.816 | 0.073 | −25.901 | 1.190 |
Dia | −0.489 | 0.102 | −4.795 | <0.0001 | −0.692 | −0.286 |
Age | 0.008 | 0.012 | 0.725 | 0.471 | −0.015 | 0.032 |
Sand | 3.144 | 0.891 | 3.528 | 0.001 | 1.370 | 4.918 |
Clay | 0.000 | 0.000 | ||||
Low Traffic | 1.151 | 1.546 | 0.744 | 0.459 | −1.926 | 4.227 |
Heavy Traffic | 5.961 | 2.107 | 2.830 | 0.006 | 1.768 | 10.154 |
Trees | −2.819 | 1.236 | −2.280 | 0.025 | −5.280 | −0.359 |
no trees | 0.297 | 1.134 | 0.262 | 0.794 | −1.959 | 2.554 |
shallow under 6 | 0.580 | 1.034 | 0.561 | 0.576 | −1.478 | 2.639 |
deep bury | 0.000 | 0.000 | ||||
pressure | 0.194 | 0.119 | 1.625 | 0.108 | −0.044 | 0.431 |
Ductile | 2.342 | 0.640 | 3.658 | 0.000 | 1.068 | 3.617 |
GI | 5.473 | 0.598 | 9.151 | <0.0001 | 4.282 | 6.663 |
PVC | 3.229 | 0.717 | 4.501 | <0.0001 | 1.801 | 4.657 |
AC | 12.428 | 0.757 | 16.408 | <0.0001 | 10.920 | 13.935 |
HDPE | 0.000 | 0.000 | ||||
Note that the variables with larger coefficients generally have more impact on the number of leaks (see
・ Step 7: The equation can then be used to predict the number of breaks going forward based on the information about breaks going back in time.
・ Step 8: Finally the data can be used to predict where the breaks might occur in the future based on the past (
The hope is that these correlate well. The process is not time consuming but provides useful information on the system. It needs to be kept up as things change, but exact data is not really needed and none of this requires destructive testing.
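Steps 3 and 6 can be sketched together: the pipe material is converted to presence/absence columns, and the fitted coefficients are then applied to one asset. The coefficients below are copied from the regression table above (including its negative intercept), and the asset is the first row of the example asset table. The dictionary layout, the helper names and the shortened key `shallow under 6` are assumptions made for this illustration.

```python
MATERIALS = ["Ductile", "GI", "PVC", "AC", "HDPE"]

def one_hot(value, categories):
    """Step 3: turn one categorical value into presence/absence (1/0) columns."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return {c: int(c == value) for c in categories}

# Coefficients as listed in the regression table (Clay, deep bury and HDPE are 0.000 there).
INTERCEPT = -12.355
COEFFICIENTS = {
    "Dia": -0.489, "Age": 0.008, "Sand": 3.144, "Clay": 0.0,
    "Low Traffic": 1.151, "Heavy Traffic": 5.961, "Trees": -2.819, "no trees": 0.297,
    "shallow under 6": 0.580, "deep bury": 0.0, "pressure": 0.194,
    "Ductile": 2.342, "GI": 5.473, "PVC": 3.229, "AC": 12.428, "HDPE": 0.0,
}

def predict_breaks(row):
    """Step 6: evaluate the linear predictive equation for one asset."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in row.items())

# First asset in the example table: 2-inch AC main, 45 years old, sand, low traffic, trees, shallow.
row = {"Dia": 2, "Age": 45, "Sand": 1, "Clay": 0, "Low Traffic": 1, "Heavy Traffic": 0,
       "Trees": 1, "no trees": 0, "shallow under 6": 1, "deep bury": 0, "pressure": 55}
row.update(one_hot("AC", MATERIALS))

predicted = predict_breaks(row)
print(f"predicted breaks in 10 years: {predicted:.3f}")
```

For this asset the sketch gives about 12.2 breaks, against 17, 12 and 10 observed for the three table rows with identical inputs. That is the kind of rough agreement the method aims for.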
Conducting an exercise to develop the methodology was useful, but the next step was to do something with the results. The Dania Beach, FL sewer system was used as an example given that actual data of failures (pipe breaks from sewer leak data) existed and an understanding of the system was available. The City of Dania is approximately 7.7 square miles. The analyzed network includes approximately 1500 assets located within the public right-of-way (ROW). Two asset maps were acquired to aid in the data collection. The first map depicted the system as it existed in the early 2010s. The original intention of this map was to illustrate which pipes in the network were suspected of breakage based on a midnight monitoring exercise after sealing the system [
The original installation design records were obtained. Dates and materials were assigned for large sections of the City. Most of the pipe was vitrified clay installed in the 1960s and 1970s, so an install date estimated to within 5 years was assigned; since the life of sewer pipe assets is expected to range from 80 to 100 years, ±5 years was not deemed significant for the purposes of this analysis. Many other indicators of failure, for example pipe diameter, groundwater, soils, traffic, trees and pipe depth, were included. A Geographic Information System (GIS) provided spatial analytics and asset data management, while Excel provided independent asset analysis and comparative analytics. Soils maps from the US Department of Agriculture, along with contractor information and information developed by the authors, were used to help with groundwater and soils data.
From the map created in
The statistical analysis tool XLStat® was utilized in the data analysis. XLStat® requires that each variable be represented as a numeral.
Asset | Given Asset ID | Material | Age | Traffic Loading | Diameter | Depth | Length | Pipe Breaks |
---|---|---|---|---|---|---|---|---|
8'' Gravity SS | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |
8'' Gravity SS | 2 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |
8'' Gravity SS | 3 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
8'' Gravity SS | 4 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 5 | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 6 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 7 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 8 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 9 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 10 | 0 | 1 | 0 | 1 | 1 | 1 | 0 |
8'' Gravity SS | 11 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
8'' Gravity SS | 12 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
8'' Gravity SS | 13 | 0 | 1 | 0 | 0 | 1 | 1 | 1 |
8'' Gravity SS | 14 | 0 | 1 | 0 | 0 | 1 | 1 | 0 |
Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
---|---|---|---|---|---|---|---|
Pipe Breaks | 851 | 0 | 851 | 0 | 1 | 0.146 | 0.353 |
Material | 851 | 0 | 851 | 0 | 1 | 0.831 | 0.375 |
Age | 851 | 0 | 851 | 0 | 3 | 1.231 | 1.033 |
Traffic Loading | 851 | 0 | 851 | 0 | 2 | 0.157 | 0.491 |
Diameter | 851 | 0 | 851 | 0 | 1 | 0.114 | 0.318 |
Depth | 851 | 0 | 851 | 0 | 1 | 0.155 | 0.362 |
Length | 851 | 0 | 851 | 0 | 1 | 0.927 | 0.26 |
Variables | Material | Age | Traffic Loading | Diameter | Depth | Length | Pipe Breaks |
---|---|---|---|---|---|---|---|
Material | 1 | 0.101 | 0.119 | −0.104 | −0.603 | −0.102 | 0.035 |
Age | 0.101 | 1 | −0.019 | −0.166 | −0.08 | −0.099 | −0.102 |
Traffic Loading | 0.119 | −0.019 | 1 | 0.556 | 0.008 | −0.057 | −0.038 |
Diameter | −0.104 | −0.166 | 0.556 | 1 | 0.204 | 0.001 | 0.03 |
Depth | −0.603 | −0.08 | 0.008 | 0.204 | 1 | 0.095 | −0.011 |
Length | −0.102 | −0.099 | −0.057 | 0.001 | 0.095 | 1 | 0.116 |
Pipe Breaks | 0.035 | −0.102 | −0.038 | 0.03 | −0.011 | 0.116 | 1 |
Source | DF | Sum of squares | Mean squares | F | Pr > F |
---|---|---|---|---|---|
Model | 6 | 3.101 | 0.517 | 4.242 | 0 |
Error | 844 | 102.831 | 0.122 | ||
Corrected Total | 850 | 105.932 |
Computed against model Y = Mean (Y).
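The entries in the analysis of variance table can be cross-checked with simple arithmetic. The sketch below recomputes the mean squares and the F statistic from the sums of squares and degrees of freedom listed in the table. It also derives the share of variance explained by the model (R²), a quantity that is not reported in the table and is computed here only as an illustration.

```python
# Values copied from the analysis of variance table.
model_df, model_ss = 6, 3.101
error_df, error_ss = 844, 102.831
total_ss = 105.932

mean_square_model = model_ss / model_df
mean_square_error = error_ss / error_df
f_statistic = mean_square_model / mean_square_error

# Derived quantity (not in the table): fraction of total variation explained by the model.
r_squared = model_ss / total_ss

print(f"MS model = {mean_square_model:.3f}, MS error = {mean_square_error:.3f}")
print(f"F = {f_statistic:.3f}, R^2 = {r_squared:.3f}")
```

The recomputed F statistic agrees with the tabulated 4.242. The derived R² of roughly 0.03 suggests that, although the model is statistically significant, the six coded variables explain only a small part of the variation in pipe breaks.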
heavy traffic areas so this was also not a good identifier.
A linear regression formula was developed from the factors and the amount of influence that each exerts to yield the predictive equation. In this case it is:
$$\text{Likelihood of breaks} = 0.073\,\text{Material} + 0.089\,\text{Age} + 0.078\,\text{Loading} + 0.066\,\text{Diameter} + 0.003\,\text{Depth} + 0.011\,\text{Length} \tag{11}$$
This equation can then be used to predict the number of breaks using the consequences going back in time.
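As a sketch of how Equation (11) might be used to rank assets, the code below scores three of the coded sewer segments from the asset table above (asset IDs 1, 4 and 11) and sorts them from highest to lowest score. The coefficients come from Equation (11); the dictionary layout and function name are assumptions made for illustration.

```python
# Coefficients from Equation (11).
COEFFICIENTS = {
    "Material": 0.073, "Age": 0.089, "Traffic Loading": 0.078,
    "Diameter": 0.066, "Depth": 0.003, "Length": 0.011,
}

def break_score(asset):
    """Evaluate Equation (11) for one asset described by its coded variables."""
    return sum(COEFFICIENTS[name] * asset[name] for name in COEFFICIENTS)

# Coded values for three gravity sewer segments, copied from the asset table.
assets = {
    1: {"Material": 1, "Age": 0, "Traffic Loading": 0, "Diameter": 0, "Depth": 1, "Length": 1},
    4: {"Material": 0, "Age": 1, "Traffic Loading": 0, "Diameter": 1, "Depth": 1, "Length": 1},
    11: {"Material": 0, "Age": 1, "Traffic Loading": 0, "Diameter": 0, "Depth": 1, "Length": 1},
}

scores = {asset_id: break_score(asset) for asset_id, asset in assets.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for asset_id in ranked:
    print(f"asset {asset_id}: {scores[asset_id]:.3f}")
```

The resulting scores are best read as a relative ranking for prioritizing inspection rather than as exact break counts.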
Many utilities have not implemented comprehensive asset management plans for their assets. In part, this is due to the belief that they cannot properly assess the assets or the cost to do so from traditional methods is too expensive or yields data of limited value. As a result, they have limited data to present to decision-makers about the condition of their assets, and the likelihood of failure, creating an atmosphere of hoping to avoid catastrophic failures. However, utilities with limited financial capability, and who might be most at risk if failure occurs, can develop an asset management program to help identify critical risks and provide data to decision-makers who need to provide the fiscal resources to properly manage and maintain a utility system.
In this exercise, an effort was made to develop a methodology to evaluate utility assets, buried and otherwise, to help identify financial resources needed to maintain a utility system. The concept was to create data on the assets for a condition assessment (buried pipe is not visible and in most cases, cannot really be assessed). A challenge is posed with buried infrastructure since many utilities lack the resources for examining buried infrastructure, so other methods of data collection are needed. However, much more information is known about buried infrastructure than one anticipates. This permits an assessment of likelihood of condition, using the parameters of Bayesian statistical methods applied in the field.
For predictive methods to work, there needs to be a measurable consequence (breaks, flooding, etc.) that indicates a failure and can be used to predict future maintenance needs. Unfortunately, many utilities do not track such consequences or do not collect this data from work orders (if they use work orders). The lack of tracking makes it difficult to determine which factors are the critical ones. In this project, the effort was applied to a sewer system since that system tracked pipe damage. The occurrence of a pipe break is a consequence, but the number of breaks was found to be of greater value.
Bloetscher, F., Wander, L., Smith, G. and Dogon, N. (2017) Public Infrastructure Asset Assessment with Limited Data. Open Journal of Civil Engineering, 7, 468-487. https://doi.org/10.4236/ojce.2017.73032