Positive Percent Agreement
International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Guide 51 defines risk as "the combination of the probability of occurrence of harm and the severity of that harm."10 To evaluate and select test methods, laboratory professionals typically compare sensitivity (positive percent agreement, PPA) and specificity (negative percent agreement, NPA), followed by the positive predictive value (PPV) and the negative predictive value (NPV): the probabilities that a positive or negative test result represents a truly positive or truly negative patient in the tested population. These measures alone do not adequately or simply predict the level of risk to the patient or the clinical costs associated with each test method. To estimate the probability of harm, we calculated the probability that a positive result is a false positive (PFP) and the probability that a negative result is a false negative (PFN). PFP is the number of false-positive results as a fraction of all positive results; it is the complement of PPV, PFP = 1 − PPV. PFN is the number of false-negative results as a fraction of all negative results; it is the complement of NPV, PFN = 1 − NPV. We roughly estimated the cost of incorrect results and, from these estimates, projected the severity of harm as the cost to patients and healthcare facilities. PPA and NPA are inherent to the test method, but the probabilities of true and false results in clinical settings change with the prevalence of the virus or antibody in the tested population. "In a population with a prevalence of 5%, a test with a sensitivity of 90% and a specificity of 95% gives a positive predictive value of 49%. In other words, less than half of those who test positive will truly have antibodies. Alternatively, in a population with an antibody prevalence of more than 52%, the same test gives a positive predictive value of more than 95%, meaning that fewer than one in 20 people who test positive will have a false-positive result."11
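These relationships are easy to make concrete. Below is a minimal sketch assuming a simple binary-classification model; the helper name predictive_values is ours, not the paper's:

    # Minimal sketch: PPV, NPV, and the risk measures PFP and PFN as
    # functions of sensitivity (PPA), specificity (NPA), and prevalence.
    def predictive_values(sensitivity: float, specificity: float, prevalence: float):
        """Return (PPV, NPV, PFP, PFN) for a test applied at a given prevalence."""
        tp = sensitivity * prevalence                # true positives
        fn = (1 - sensitivity) * prevalence          # false negatives
        tn = specificity * (1 - prevalence)          # true negatives
        fp = (1 - specificity) * (1 - prevalence)    # false positives
        ppv = tp / (tp + fp)   # P(truly positive | test positive)
        npv = tn / (tn + fn)   # P(truly negative | test negative)
        return ppv, npv, 1 - ppv, 1 - npv            # PFP = 1 - PPV, PFN = 1 - NPV

    # Reproduces the quoted example: 90% sensitivity, 95% specificity, 5% prevalence.
    ppv, npv, pfp, pfn = predictive_values(0.90, 0.95, 0.05)
    print(f"PPV = {ppv:.0%}, PFP = {pfp:.0%}")       # -> PPV = 49%, PFP = 51%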
Effects of Increasing Positive Percent Agreement (PPA; Sensitivity) on False Results at Baseline Prevalence and Negative Percent Agreement (NPA)

Figure 1 shows how an increasing prevalence of truly positive samples affects PFP and PFN. The number of patients testing positive for SARS-CoV-2 virus or SARS-CoV-2 antibody increases with prevalence. Prevalence is determined by the spread of COVID-19 in the tested population and is beyond the control of test selection and quality. As prevalence increases, the number of truly positive samples rises and the number of truly negative samples falls. False-positive results are drawn from the truly negative samples, so they decrease as well. The models for all tests are similar, but not identical, because the baseline values of PPA and NPA differ between test types. When prevalence increases from 2% to 20%, with PPA and NPA held constant at baseline, PFP decreases markedly: from 70.3% to 16.2% for molecular tests, from 58.0% to 10.1% for antigen tests, and from 75.9% to 20.5% for antibody tests. Unlike false positives, which decrease with prevalence, false negatives increase. False-negative results are drawn from the truly positive samples, so PFN increases more than tenfold over this prevalence range: from 0.3% to 3.5% for molecular tests, from 0.8% to 8.9% for antigen tests, and from 0.7% to 7.6% for antibody tests. This dramatic increase can be masked by looking only at NPV, which decreases only slightly overall, from 99.7% to 96.5%.
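The prevalence effect can be reproduced with the predictive_values helper sketched above. The baseline values used here (PPA = 0.95, NPA = 0.98) are assumed for illustration only and are not the paper's baselines:

    # Sweep prevalence from 2% to 20% with PPA and NPA held constant.
    for prevalence in (0.02, 0.05, 0.10, 0.20):
        ppv, npv, pfp, pfn = predictive_values(0.95, 0.98, prevalence)
        print(f"prevalence {prevalence:>4.0%}:  PFP = {pfp:5.1%}   PFN = {pfn:5.1%}")
    # PFP shrinks and PFN grows as prevalence rises, while NPV itself changes
    # only slightly, mirroring the trend described above for all three test types.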
An example is the microbiological throat swab used in patients with a sore throat. Publications that report the PPV of a throat swab usually report the likelihood that the bacterium is present in the throat, rather than the likelihood that the patient is ill from the bacterium found. If the presence of this bacterium always led to a sore throat, the PPV would be very useful. However, the bacterium may colonize individuals harmlessly and never lead to infection or disease; a sore throat occurring in such carriers is caused by other pathogens, such as a virus. In this situation, the gold standard used in the evaluation study represents only the presence of the bacterium (which could be harmless), not a causal bacterial sore throat. It can be shown that this problem affects the positive predictive value far more than the negative predictive value. [5] To evaluate diagnostic tests where the gold standard detects only potential causes of disease, one can use an extension of the predictive value called the etiologic predictive value. [6][7] We show the value of reporting the probability of false-positive results, the probability of false-negative results, and the cost to patients and healthcare facilities. These risk measures can be calculated from the performance characteristics PPA and NPA, in combination with estimates of prevalence, cost, and Reff (the number of people infected by one positive carrier of SARS-CoV-2). For tests that claim even higher performance, such as 99% sensitivity or 99% NPV, extreme caution should be exercised when interpreting an evaluation, even when the comparator carries only slight uncertainties.
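As a rough illustration of how such a cost projection could be assembled, the sketch below combines PPA, NPA, prevalence, per-error costs, and Reff; every parameter value here (cost_fp, cost_fn, r_eff) is a hypothetical placeholder, not a figure from the paper:

    # Hypothetical cost model: expected cost of incorrect results per batch of tests.
    def expected_error_cost(sensitivity, specificity, prevalence,
                            cost_fp, cost_fn, r_eff, n_tests=1000):
        fp = (1 - specificity) * (1 - prevalence) * n_tests  # expected false positives
        fn = (1 - sensitivity) * prevalence * n_tests        # expected false negatives
        # Each false negative is an undetected carrier who may infect r_eff
        # others, so its downstream cost is scaled accordingly.
        return fp * cost_fp + fn * cost_fn * r_eff

    # Illustrative call with placeholder costs (arbitrary currency units).
    print(expected_error_cost(0.95, 0.98, 0.05,
                              cost_fp=500, cost_fn=2000, r_eff=2.5))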
In these cases, a seemingly reasonable requirement for robust test performance leads, in almost all cases, to the rejection of even a perfect test when the effects shown here are not taken into account. Stakeholders interested in very high performance (e.g., 99% sensitivity or 99% NPV) should bear in mind that such performance characteristics can only be convincingly demonstrated against a nearly error-free comparator method. In the absence of a near-flawless comparator, it will not be possible to validate such high performance characteristics, and attempts to do so are likely to underestimate the performance of candidate tests. The influence of an imperfect comparator on very-high-performance tests is analyzed quantitatively in S7 Supporting Information ("Very High Performance Tests"). The FDA's recent guidance for laboratories and manufacturers, "FDA Policy for Diagnostic Tests for Coronavirus Disease-2019 during the Public Health Emergency," states that users should use a clinical agreement study to determine performance characteristics (sensitivity/PPA, specificity/NPA). Although the terms sensitivity and specificity are widely known and used, the terms PPA and NPA are not. PPA: positive percent agreement. Ground truth: the true positive or negative state of a subject in a binary classification scheme.
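The comparator effect can be made concrete with a small model. If the candidate test is actually perfect and is scored against an imperfect comparator, the candidate's apparent PPA equals the comparator's own PPV (a simplified sketch under that assumption; the function name apparent_ppa is ours):

    # Apparent PPA of a PERFECT candidate test scored against an imperfect
    # comparator. Because the candidate matches ground truth exactly, its
    # apparent PPA is the fraction of comparator-positive samples that are
    # truly positive, i.e. the comparator's own PPV.
    def apparent_ppa(prevalence, comp_sens, comp_spec):
        true_pos = prevalence * comp_sens
        false_pos = (1 - prevalence) * (1 - comp_spec)
        return true_pos / (true_pos + false_pos)

    # Even a flawless test appears to fall far short of a 99% sensitivity bar:
    print(f"{apparent_ppa(0.05, comp_sens=0.95, comp_spec=0.99):.1%}")  # ~83.3%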
If the person tested has a different pre-test probability of disease than the control groups used to establish PPV and NPV, then PPV and NPV generally differ from the positive and negative post-test probabilities: PPV and NPV refer to the values established by the control groups, whereas the post-test probabilities refer to the individual being tested (estimated, for example, using likelihood ratios).
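A short sketch of that post-test calculation, using the standard positive likelihood ratio LR+ = sensitivity / (1 − specificity) and Bayes' rule on the odds scale (the numbers are illustrative):

    # Post-test probability for an individual with a given pre-test probability.
    def posttest_probability(pretest, sensitivity, specificity):
        lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
        pre_odds = pretest / (1 - pretest)        # probability -> odds
        post_odds = pre_odds * lr_pos             # Bayes' rule on the odds scale
        return post_odds / (1 + post_odds)        # odds -> probability

    # The same test gives very different post-test probabilities to individuals
    # whose pre-test probability differs from the study population's prevalence.
    for pretest in (0.05, 0.30):
        print(f"pre-test {pretest:.0%} -> post-test "
              f"{posttest_probability(pretest, 0.90, 0.95):.0%}")
    # At a 5% pre-test probability this reproduces the 49% PPV quoted earlier.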