Oak Leaf programming interface to modern Fortran since its 2008 revision.
Write and save to the file basic.f08 the source code:
program basic use oakleaf real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ] real :: mean,stderr,stdsig logical :: reliable call rmean(data,mean,stderr,stdsig,reliable) write(*,*) 'rmean:',mean,stderr,stdsig,reliable end program basic
The code is ready to be compiled
$ gfortran -I/usr/include -o basic basic.f08 -L/usr/lib -loakleaf
and subsequently launched:
$ ./basic rmean: 3.00000000 0.748169661 1.65011716 T
That's all!
The robust mean estimator is a heart of OakLeaf. rmean()
is designed as a replacement for the arithmetical mean:
its execution is significanlty slower, the robust performance great.
There are two modes of rmean()
calls: plain and weighed.
The weighted mean is an obsolete term for use of a priory
known errors of the data.
subroutine rmean(data,errors,mean,stderr,stdsig,scale,reliable,flag,verbose) subroutine rmean(data,mean,stderr,stdsig,scale,reliable,flag,verbose) real(kind), dimension(:), intent(in) :: data real(kind), dimension(:), intent(in), optional :: errors real(kind), intent(out) :: mean real(kind), intent(out), optional :: stderr, stdsig, scale logical, intent(out), optional :: reliable integer, intent(out), optional :: flag logical, intent(in), optional :: verbose end subroutine rmean On input: data - array of data values to be estimated errors (optional) - statistical errors of data (errors > 0), for weighted mean verbose (optional) - print some verbose information On output are estimated: mean - robust mean stderr (optional) - standard error stdsig (optional) - standard deviation scale (optional) - scale of function normalisation reliable (optional) - realiable results? flag (optional) - indicates reliability of results: 0 == success, results are both reliable and accurate 1 == consider a rough estimate due convergence fail, try verbose 2 == basic estimates, few datapoints available 3 == data looks nearly identical, MAD is zero 4 == improper input parameters (sizes of arrays errors, data are unequal, any element of error <= 0), try verbose 5 == failed to allocate memory, only very initial estimates of mean, stderr, stdsig are just returned The results gives the robust estimate of the true value X of the sample x falling with 68% probability into the interval mean - stderr < X < mean + stderr The distribution can be approximated by Normal distribution N(mean,stdsig). WARNING This routine gives unexpected results when setup of input errors is incorrect. It is indicated by significant violation of scale /= 1. If you encountered it, run plain rmean() instead. If errors holds some additional property or condition, use it.
This routine estimates both averadge and standard deviation of a data set on base of quantiles. The median is claimed as the average of the data (which its is direct equivalent of) Q(0.5). Standard deviation is estimated as [Q(0.75) - Q(0.25)]/2.
Quantiles are robust estimators by principle, however the results are more granulated ‒ with smaller accuracy. Two data points are the absolute minimum of data amount.
subroutine qmean(data,mean,stdsig,flag)
real(dbl), dimension(:), intent(in) :: data
real(dbl), intent(out) :: mean
real(dbl), intent(out), optional :: stdsig
integer, intent(out), optional :: flag
end subroutine qmean
On input:
data - array of data values to be estimated
On output are estimated:
mean - robust mean
stdsig (optional) - standard deviation
flag (optional) - indicates reliability of results:
0 == success, results are both reliable and precise
1 == fail due too few data (less than 3)
This routine estimates the averadge by median and the standard deviation by the median of absolute deviations (MAD).
subroutine madmed(x,t,s) subroutine madwmed(x,dx,t,s,d) real(kind), dimension(:),intent(in) :: x, dx real(kind), intent(out) :: t,s end subroutine qmean On input: x - array of data values to be estimated dx (optional) - array of the data errors On output are estimated: t - median s - standard deviation as MAD/Q50
Robust estimation of the factor lambda of Poisson distribution, it is an average number of events per a time period.
subroutine fmean(photons,photons_err,counts,counts_err, &
mean,stderr,stdsig,scale,reliable,poisson,flag,verbose)
real(kind), dimension(:), target, intent(in) :: &
photons,photons_err,counts,counts_err
real(kind), intent(out) :: mean
real(kind), intent(out), optional :: stderr,stdsig,scale
logical, intent(out), optional :: reliable, poisson
integer, intent(out), optional :: flag
logical, intent(in), optional :: verbose
This module implements a robust estimator of a ratio of events
t = photon / counts
(including estimation of its scatter) appearing as the factor
lambda parameter of Poisson distribution
Po(lambda) = lambda**k/k! * exp(-lambda), lambda = t * C
which provides zero mean for difference of N1 - N2 events
in Skellam's distribution
https://en.wikipedia.org/wiki/Skellam_distribution
The estimation is valid only for large values of lambda >> 1
in Gaussian regime when transformation
(photon - t*counts) / sqrt(photon_err**2 + t**2*count_err**2)
is valid. The approach usually requires proper estimates errors.
Both photons, counts should be non-integers > 1 as a product of
previous computations.
fmean should be called as:
call fmean(photons,photons_err,counts,counts_err,mean,stderr,stdsig,
scale,reliable,poisson,verbose)
On input:
photons - array of reference photon rates
photons_err - array of statistical errors of photons
counts - array of measured rates
counts_err - array of statistical errors of counts
verbose - print additional info
On output are estimated:
mean - mean
stderr (optional) - its standard error
stdsig (optional) - estimate of standard deviation
scale (optional) - estimate of scale
reliable - indicates reliability of results (optional)
poisson - indicates Poisson (.true.) or Normal (.false.)
distribution used for determine of results (optional)
The given results means that a true value T of the sample
photons / counts can be, with 70% probability, found in interval
t - dt < T < t + dt
The estimate is performed as maximum of entropy of Poisson density
for values of photons smaller than 'plimit' (666) or by robust estimate
of parameters of Normal distribution. Both the estimates are robust.
Poisson mean is estimated as the minimum of free energy:
https://en.wikipedia.org/wiki/Helmholtz_free_energy
The routines, described above, works with help of supporting routines described below. They are should be usefull in general.
This routine provides estimation of scale by maximising of information, or minimising of variance (dispersion): Fisher information.
The scale of dispersion of data is crutial quantity for robust estimates on base of the maximum-likelihood principle.
Iscale should be called as:
call iscale(r,s,reliable,flag,verbose)
On input:
r - array of residuals: (data-mean), (data-mean)/errors, ...
verbose (optional) - verbose prints
On output are estimated:
s - scale
reliable (optional) - indicates reliability of result
if .true., scale has been correctly
localised. if .false., s is undefined
flag (optional) - result code:
0 == success, results are both reliable and precise
1 == consider rough estimate due convergence fail, try verbose
2 == basic estimates, few datapoints available
3 == data looks nearly identical
5 == failed to allocate memory, initial estimates are returned
Note.
Scale parameter shouldn't confused with standard deviation.
Their values are identical only in case of Normal distribution.
Empirical cumulative distribution function (CDF) constructed from data.
subroutine ecdf(data,x,p)
real(dbl),dimension(:),intent(in) :: data
real(dbl),dimension(:),intent(out) :: x,p
end subroutine ecdf
On input:
data - array of data values to be estimated
On output:
x - coordinates of absicca of steps
p - step values
Estimates Q-quantiles by a linear interpolation of empirical CDF between discrete CDF steps. This is the inverse function of the empirical CDF.
function quantile(q,x,y) result(t)
real(dbl), intent(in) :: q
real(dbl), dimension(:), intent(in) :: x,y
end function quantile
On input:
q - quantile as fraction (0.0 <= q <= 1.0)
x,y - empirical CDF
Result:
Value of x abscicca for given quantile