public class PhiAccrualFailureDetector extends java.lang.Object implements FailureDetector
The suspicion level of failure is given by a value called φ (phi). The basic idea of the φ failure detector is to express the value of φ on a scale that is dynamically adjusted to reflect current network conditions. φ is compared against a configurable threshold to decide whether the monitored resource is considered to have failed.
The value of φ is calculated as:
φ = -log10(1 - F(timeSinceLastHeartbeat))
where F is the cumulative distribution function of a normal distribution with mean
and standard deviation estimated from historical heartbeat inter-arrival times.
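Inverting the definition gives a direct reading of φ as a tail probability, which may help when choosing a threshold. The following is only a restatement of the formula above, with t standing for timeSinceLastHeartbeat; it is not an additional guarantee of the implementation:

```latex
\varphi = -\log_{10}\bigl(1 - F(t)\bigr)
\quad\Longleftrightarrow\quad
1 - F(t) = 10^{-\varphi}

\varphi = 1 \;\Rightarrow\; 1 - F(t) = 0.1, \qquad
\varphi = 2 \;\Rightarrow\; 1 - F(t) = 0.01, \qquad
\varphi = 3 \;\Rightarrow\; 1 - F(t) = 0.001
```

Here 1 - F(t) is the probability, under the estimated normal distribution, that a heartbeat interval is longer than t. Each unit increase of φ therefore corresponds to a tenfold drop in the modelled chance that a heartbeat is merely late.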
param: threshold A low threshold is prone to generate many wrong suspicions but ensures a quick detection in the event of a real crash. Conversely, a high threshold generates fewer mistakes but needs more time to detect actual crashes.
param: maxSampleSize Number of samples to use for calculation of mean and standard deviation of inter-arrival times.
param: minStdDeviation Minimum standard deviation to use for the normal distribution used when calculating phi. Too low a standard deviation might make the detector overly sensitive to sudden, but normal, deviations in heartbeat inter-arrival times.
param: acceptableHeartbeatPause Duration corresponding to the number of potentially lost or delayed heartbeats that will be accepted before considering it to be an anomaly. This margin is important for surviving sudden, occasional pauses in heartbeat arrivals caused by, for example, garbage collection or network drops.
param: firstHeartbeatEstimate Bootstrap the stats with heartbeats that correspond to this duration, with a rather high standard deviation (since the environment is unknown in the beginning).
param: clock The clock, which returns the current time in milliseconds and can be faked for testing purposes. It is only used for measuring intervals (durations).
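To make the interplay of maxSampleSize, minStdDeviation, acceptableHeartbeatPause and firstHeartbeatEstimate more concrete, here is a minimal, self-contained sketch of a bounded sample window. It is not the library's implementation: the class name HeartbeatWindowSketch, the seeding of two samples at the estimate plus and minus a quarter of it, and the choice to add the pause to the mean are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: a bounded window of heartbeat inter-arrival times (hypothetical class). */
final class HeartbeatWindowSketch {
    private final int maxSampleSize;
    private final Deque<Long> intervals = new ArrayDeque<>();
    private double sum = 0.0;
    private double squaredSum = 0.0;

    HeartbeatWindowSketch(int maxSampleSize, long firstHeartbeatEstimateMillis) {
        if (maxSampleSize < 1) {
            throw new IllegalArgumentException("maxSampleSize must be >= 1");
        }
        this.maxSampleSize = maxSampleSize;
        // Assumption: seed with two samples spread around the estimate so that
        // the initial standard deviation is deliberately large (estimate / 4).
        long spread = firstHeartbeatEstimateMillis / 4;
        add(firstHeartbeatEstimateMillis - spread);
        add(firstHeartbeatEstimateMillis + spread);
    }

    /** Records one inter-arrival time, evicting the oldest sample when the window is full. */
    void add(long intervalMillis) {
        if (intervals.size() >= maxSampleSize) {
            long dropped = intervals.removeFirst();
            sum -= dropped;
            squaredSum -= (double) dropped * dropped;
        }
        intervals.addLast(intervalMillis);
        sum += intervalMillis;
        squaredSum += (double) intervalMillis * intervalMillis;
    }

    double mean() {
        return sum / intervals.size();
    }

    /** Population standard deviation, never smaller than the configured floor. */
    double stdDeviation(long minStdDeviationMillis) {
        double m = mean();
        double variance = Math.max(0.0, squaredSum / intervals.size() - m * m);
        return Math.max(Math.sqrt(variance), (double) minStdDeviationMillis);
    }

    /** Assumption: the tolerated pause shifts the centre of the distribution used for phi. */
    double meanWithPause(long acceptableHeartbeatPauseMillis) {
        return mean() + acceptableHeartbeatPauseMillis;
    }

    public static void main(String[] args) {
        HeartbeatWindowSketch window = new HeartbeatWindowSketch(3, 1000);
        System.out.println(window.mean());            // 1000.0 (seeded with 750 and 1250)
        System.out.println(window.stdDeviation(100)); // 250.0
    }
}
```

In this sketch the minStdDeviation floor only takes effect once the observed intervals become very regular, which is the case in which an unfloored estimate would react to small, normal jitter.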
FailureDetector.Clock
Constructor and Description |
---|
PhiAccrualFailureDetector(com.typesafe.config.Config config, EventStream ev) Constructor that reads parameters from config. |
PhiAccrualFailureDetector(double threshold, int maxSampleSize, scala.concurrent.duration.FiniteDuration minStdDeviation, scala.concurrent.duration.FiniteDuration acceptableHeartbeatPause, scala.concurrent.duration.FiniteDuration firstHeartbeatEstimate, FailureDetector.Clock clock) |
Modifier and Type | Method and Description |
---|---|
scala.concurrent.duration.FiniteDuration | acceptableHeartbeatPause() |
scala.concurrent.duration.FiniteDuration | firstHeartbeatEstimate() |
void | heartbeat() Notifies the FailureDetector that a heartbeat arrived from the monitored resource. |
boolean | isAvailable() Returns true if the resource is considered to be up and healthy and returns false otherwise. |
boolean | isMonitoring() Returns true if the failure detector has received any heartbeats and started monitoring of the resource. |
int | maxSampleSize() |
scala.concurrent.duration.FiniteDuration | minStdDeviation() |
double | phi() The suspicion level of the accrual failure detector. |
double | phi(long timeDiff, double mean, double stdDeviation) Calculation of phi, derived from the cumulative distribution function for the N(mean, stdDeviation) normal distribution, given by 1.0 / (1.0 + math.exp(-y * (1.5976 + 0.070566 * y * y))) where y = (x - mean) / standard_deviation. This is an approximation defined in β Mathematics Handbook (Logistic approximation). |
double | threshold() |
public PhiAccrualFailureDetector(double threshold, int maxSampleSize, scala.concurrent.duration.FiniteDuration minStdDeviation, scala.concurrent.duration.FiniteDuration acceptableHeartbeatPause, scala.concurrent.duration.FiniteDuration firstHeartbeatEstimate, FailureDetector.Clock clock)
public PhiAccrualFailureDetector(com.typesafe.config.Config config, EventStream ev)
Constructor that reads parameters from config. Expects the config properties threshold, max-sample-size, min-std-deviation, acceptable-heartbeat-pause and heartbeat-interval.
config - (undocumented)
ev - (undocumented)
public double threshold()
public int maxSampleSize()
public scala.concurrent.duration.FiniteDuration minStdDeviation()
public scala.concurrent.duration.FiniteDuration acceptableHeartbeatPause()
public scala.concurrent.duration.FiniteDuration firstHeartbeatEstimate()
public boolean isAvailable()
Specified by: isAvailable in interface FailureDetector
public boolean isMonitoring()
Specified by: isMonitoring in interface FailureDetector
public final void heartbeat()
Specified by: heartbeat in interface FailureDetector
public double phi()
If a connection does not have any records in the failure detector, it is considered healthy.
public double phi(long timeDiff, double mean, double stdDeviation)
timeDiff - (undocumented)
mean - (undocumented)
stdDeviation - (undocumented)
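The logistic approximation quoted for phi(long timeDiff, double mean, double stdDeviation) can be sketched in a few lines of standalone Java. This is an illustration of the documented formula, not the library source: the class name PhiSketch, the two-branch evaluation of the tail probability, and the rule that a resource counts as available while phi is below the threshold are assumptions.

```java
/** Illustrative sketch of the documented phi formula (hypothetical class, not the library source). */
final class PhiSketch {

    /** Logistic approximation of the N(mean, stdDeviation) cumulative distribution function. */
    static double cdf(double x, double mean, double stdDeviation) {
        double y = (x - mean) / stdDeviation;
        return 1.0 / (1.0 + Math.exp(-y * (1.5976 + 0.070566 * y * y)));
    }

    /** phi = -log10(1 - F(timeDiff)), with F approximated as in cdf above. */
    static double phi(long timeDiff, double mean, double stdDeviation) {
        double y = (timeDiff - mean) / stdDeviation;
        double e = Math.exp(-y * (1.5976 + 0.070566 * y * y));
        // 1 - F equals e / (1 + e). That form keeps precision when F is close to 1,
        // while the second form avoids infinity / infinity when e overflows.
        double tail = (timeDiff > mean) ? e / (1.0 + e) : 1.0 - 1.0 / (1.0 + e);
        return -Math.log10(tail);
    }

    /** Assumption: the resource is treated as available while phi stays below the threshold. */
    static boolean isAvailable(double phi, double threshold) {
        return phi < threshold;
    }

    public static void main(String[] args) {
        double mean = 1000.0;        // milliseconds, illustrative value
        double stdDeviation = 100.0; // milliseconds, illustrative value
        for (long timeDiff : new long[] {500, 1000, 1500, 2000}) {
            double p = phi(timeDiff, mean, stdDeviation);
            System.out.println(timeDiff + " ms -> phi = " + p
                    + ", available = " + isAvailable(p, 8.0));
        }
    }
}
```

At timeDiff equal to the mean, the approximation gives F = 0.5 and therefore phi = log10(2), roughly 0.30. Beyond the mean, phi grows quickly because the exponent is cubic in y.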