Recently, we added the Hazen percentile function to Splashback, including it as a statistic in our apps. To accompany its release, we thought we’d write a short profile on it.
The Hazen percentile is attributed to Allen Hazen [1]. He studied and worked extensively on water and sewerage infrastructure and is best known for developing the Hazen-Williams equation, which describes the flow of water in a pipe as a function of various parameters. The Hazen percentile shows up mostly in water science applications [1], and it is an implicit or, in some cases, explicit standard for water science reports throughout Australia and New Zealand.
Calculating the Hazen percentile
This description of the calculation process is largely drawn from G McBride and G Payne’s Hazen Percentile Calculator [2].
- First, we take our n data points, X_{i}, with i = 1, ..., n, and rank them from smallest to largest. Call this ranked set Y_{i}, i = 1, ..., n.
- The number of data points required depends on the percentile you want to calculate, so check that you have enough points. Here p is the percentile expressed as a proportion, with 0 < p < 1.
For p > 0.5, n\geq \frac{1}{2(1-p)}
For p\leq 0.5 , n\geq \frac{1}{2p}
- If this is satisfied, then calculate the Hazen rank, which is the rank of the P^{th} percentile value, where P = p × 100: r_{Hazen} = np + 0.5
- If r_{Hazen} is an integer, then Y_{r_{Hazen}} is the percentile value. Otherwise, we must interpolate between the two adjacent ranked points, with our percentile value P_P (e.g. P_{24}) given by P_P = (1-r_f)Y_{r_i}+r_fY_{r_i+1}, where r_i is the integer part of r_{Hazen} and r_f is the fractional part of r_{Hazen} [2].
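The steps above can be sketched in Python as follows. This is a minimal sketch rather than the Splashback or NIWA implementation: the function name `hazen_percentile`, the choice to raise `ValueError` when there are too few points, and the rounding of the rank to guard against floating-point noise are all assumptions made for illustration. The sample-size test is written as 1 ≤ r_{Hazen} ≤ n, which is equivalent to the two conditions on n given above.

```python
import math


def hazen_percentile(values, p):
    """Hazen percentile of `values` for a proportion p (0 < p < 1).

    Illustrative sketch only; names and error handling are assumptions.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")

    # Step 1: rank the data from smallest to largest (Y_1, ..., Y_n).
    y = sorted(values)
    n = len(y)

    # Step 3: Hazen rank r = n*p + 0.5. Rounding is an assumption added here
    # so that values such as 2.9999999999 are treated as the integer 3.
    r = round(n * p + 0.5, 9)

    # Step 2: enough data points? 1 <= r <= n is equivalent to
    # n >= 1/(2p) and n >= 1/(2(1-p)).
    if not 1 <= r <= n:
        raise ValueError("not enough data points for this percentile")

    r_i = math.floor(r)  # integer part of the rank
    r_f = r - r_i        # fractional part of the rank

    # Step 4: exact rank, or linear interpolation between neighbours.
    # Ranks are 1-based, Python lists are 0-based, hence the "- 1".
    if r_f == 0:
        return y[r_i - 1]
    return (1 - r_f) * y[r_i - 1] + r_f * y[r_i]


data = [4, 9, 1, 7, 10, 2, 6, 3, 8, 5]
print(hazen_percentile(data, 0.5))  # 5.5
```

With ten points, the rank for p = 0.5 is 5.5, so the result lies halfway between the fifth and sixth ranked values. Asking for p = 0.99 with the same ten points raises an error, because the rank 10.4 falls beyond the largest value.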
With the free Splashback add-in for Excel, you can calculate a Hazen percentile in just a few clicks.

History
The key way Hazen’s method differs from other percentile calculation methods is the formula used to calculate the ranks [2]. Hazen proposed his ranking formula in his 1914 paper Storage to be Provided in Impounding Reservoirs for Municipal Water Supply. He stated this formula when commenting on how to place points on a graph of required water storage vs the proportion of the years between 1897 and 1911.
The position for plotting results can be obtained with sufficient accuracy with a 10-in slide rule. The decimal position of the mth term in the series of n terms is found to be P = \frac{2m-1}{2 n}
Allen Hazen, Storage to be Provided in Impounding Reservoirs for Municipal Water Supply [3]
So the x coordinate of the m^{th} value is given by P.
Rearranging this so rank (usually r, in this case, m) is the subject:
P=\frac{(2m-1)}{2n}\\ \rArr 2nP+1=2m\\ \rArr m=\frac{(2nP+1)}{2}\\ \rArr m=nP+\frac{1}{2}\\
Finally, replacing m with r and P with p, we get:
r=np+\frac{1}{2}
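As a quick numerical check of this rearrangement, the sketch below computes Hazen's plotting positions for a short series and converts them back to ranks. The function names and the choice of n = 5 are hypothetical, chosen purely for illustration.

```python
def hazen_plotting_position(m, n):
    """Plotting position P of the m-th of n ranked terms: (2m - 1) / (2n)."""
    return (2 * m - 1) / (2 * n)


def hazen_rank(p, n):
    """Rank corresponding to proportion p: r = n*p + 1/2."""
    return n * p + 0.5


n = 5
positions = [hazen_plotting_position(m, n) for m in range(1, n + 1)]
print(positions)  # [0.1, 0.3, 0.5, 0.7, 0.9]

# Converting each position back should recover the ranks 1, ..., n
# (up to floating-point rounding).
ranks = [hazen_rank(p, n) for p in positions]
```

Each rank m maps to the midpoint of its 1/n-wide slice of the interval from 0 to 1, which is why the positions never reach 0 or 1.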
In this paper, Hazen also makes notes on interpolation between the points. However, his interpolation advice is not part of what we know as the Hazen percentile calculation method [3].
The Hazen percentile was added to Splashback as the result of a user request. We love suggestions and feedback, and appreciate your help in shaping this platform.
If you have any ideas for functions or features, please let us know and we’ll see what we can do!
References
[1] R. J. Hyndman and Y. Fan, “Sample Quantiles in Statistical Packages”, The American Statistician, vol. 50, no. 4, pp. 361-365, 1996.
[2] G. McBride and G. Payne, Hazen Percentile Calculator, NIWA, May 2009. Accessed on: May 13, 2021. [Online]. Available: https://environment.govt.nz/assets/Publications/Files/hazen-percentile-calculator-2-3_0.xls
[3] A. Hazen, “Storage to be Provided in Impounding Reservoirs for Municipal Water Supply”, American Society of Civil Engineers Transactions, vol. 77, no. 1308, pp. 1539-1638, 1914.