Recently, we added the Hazen percentile function to Splashback, including it as a statistic in our apps. To accompany its release, we thought we’d write a short profile on it.

The Hazen percentile is attributed to Allen Hazen [1]. He studied and worked extensively on water and sewerage infrastructure and is best known for developing the Hazen-Williams equation, which describes the flow of water in a pipe as a function of various parameters. The Hazen percentile shows up mostly in water science applications [1], and it is an implicit, or in some cases explicit, standard for water science reports throughout Australia and New Zealand.

### Calculating the Hazen percentile

This description of the calculation process is largely drawn from G. McBride and G. Payne’s Hazen Percentile Calculator [2].

- First, take the n data points, X_{i}, with i = 1, ..., n, and rank them from smallest to largest. Call this new set Y_{i}, i = 1, ..., n.
- The number of data points required depends on the percentile you want to calculate, so check that you have enough points:

  For p > 0.5, n \geq \frac{1}{2(1-p)}

  For p \leq 0.5, n \geq \frac{1}{2p}
- If this is satisfied, calculate the Hazen rank, which is the rank of the P^{th} percentile value, where P = p \times 100: r_{Hazen} = np + 0.5
- If this is an integer, then Y_{r_{Hazen}} is the percentile value. Otherwise, interpolate between the two neighbouring points, with the percentile value P_P (e.g. P_{24}) given by P_P = (1-r_f)Y_{r_i} + r_f Y_{r_i+1}, where r_i is the integer part of r_{Hazen} and r_f is the fractional part of r_{Hazen} [2].
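The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Splashback's implementation: the function name `hazen_percentile` is hypothetical, p is taken as a proportion strictly between 0 and 1, and the sample-size conditions are checked in the equivalent form 1 \leq r_{Hazen} \leq n. Rounding the rank to 9 decimal places is an added guard against floating-point noise and is not part of the published method.

```python
import math


def hazen_percentile(data, p):
    """Hazen percentile of `data` for a proportion p (e.g. 0.95 for P95).

    Hypothetical helper for illustration only.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")

    # Step 1: rank the data from smallest to largest (Y_1 ... Y_n).
    y = sorted(data)
    n = len(y)

    # Step 3: Hazen rank r = n*p + 0.5 (rounded only to absorb float noise).
    r = round(n * p + 0.5, 9)

    # Step 2: enough data points? n >= 1/(2p) is the same as r >= 1,
    # and n >= 1/(2(1-p)) is the same as r <= n.
    if r < 1 or r > n:
        raise ValueError("not enough data points for this percentile")

    # Step 4: split the rank into integer and fractional parts.
    r_i = math.floor(r)
    r_f = r - r_i

    if r_f == 0:
        # Integer rank: the percentile is Y_{r_i} (1-based, so index r_i - 1).
        return y[r_i - 1]

    # Otherwise interpolate between Y_{r_i} and Y_{r_i + 1}.
    return (1 - r_f) * y[r_i - 1] + r_f * y[r_i]


sample = [12, 5, 7, 3, 9, 15, 1, 10, 8, 6]
print(hazen_percentile(sample, 0.5))   # rank 5.5, halfway between 7 and 8: 7.5
print(hazen_percentile(sample, 0.95))  # rank 10, the largest value: 15
```

With ten points, asking for p = 0.01 raises an error, since the rank would be 0.6 and fall below the first data point.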

With the **free Splashback add-in for Excel**, you can calculate a Hazen percentile in just a few clicks.

### History

The key way Hazen’s method differs from other percentile calculation methods is the formula used to calculate the ranks [2]. Hazen proposed his ranking formula in his 1914 paper *Storage to be Provided in Impounding Reservoirs for Municipal Water Supply*. He stated this formula when commenting on how to place points for a graph of required water storage vs proportion of the years between 1897 and 1911.

The position for plotting results can be obtained with sufficient accuracy with a 10-in slide rule. The decimal position of the mth term in the series of n terms is found to be P = \frac{2m-1}{2 n}

Allen Hazen, Storage to be Provided in Impounding Reservoirs for Municipal Water Supply [3]

So the x coordinate of the m^{th} value is given by P.

Rearranging this so rank (usually r, in this case, m) is the subject:

P=\frac{(2m-1)}{2n}\\ \rArr 2nP+1=2m\\ \rArr m=\frac{(2nP+1)}{2}\\ \rArr m=nP+\frac{1}{2}\\

Finally, replacing m with r and P with p, we get:

r=np+\frac{1}{2}
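As a quick sanity check on this rearrangement, the sketch below computes Hazen's plotting position for each rank and then feeds it back through the rank formula. The function names and the series length n = 15 are illustrative assumptions, and exact fractions are used so that the round trip is not blurred by floating-point error.

```python
from fractions import Fraction


def plotting_position(m, n):
    """Hazen's plotting position P = (2m - 1) / (2n) for the m-th of n terms."""
    return Fraction(2 * m - 1, 2 * n)


def hazen_rank(n, p):
    """Rank recovered from a proportion p: r = n*p + 1/2."""
    return n * p + Fraction(1, 2)


n = 15  # illustrative series length
positions = [plotting_position(m, n) for m in range(1, n + 1)]
recovered = [hazen_rank(n, P) for P in positions]

# Each plotting position maps back to the rank it came from.
print(recovered == list(range(1, n + 1)))  # True
```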

In this paper, Hazen also makes notes on interpolation between the points. However, his interpolation advice is not part of what we know as the Hazen percentile calculation method [3].

The Hazen percentile was added to Splashback as the result of a user request. We love suggestions and feedback, and appreciate your help in shaping this platform.

**If you have any ideas for functions or features then please let us know and we’ll see what we can do!**

### References

[1] R. J. Hyndman and Y. Fan, “Sample Quantiles in Statistical Packages”, The American Statistician, vol. 50, no. 4, pp. 361-365, 1996

[2] G. McBride and G. Payne, Hazen Percentile Calculator, NIWA, May 2009, Accessed on: May 13, 2021. [Online]. Available: https://environment.govt.nz/assets/Publications/Files/hazen-percentile-calculator-2-3_0.xls

[3] A. Hazen, “Storage to be Provided in Impounding Reservoirs for Municipal Water Supply”, American Society of Civil Engineers Transactions, vol. 77, no. 1308, pp. 1539-1638, 1914