Quote:
Originally Posted by pmax
(Post 10832470)
Yeah, I skimmed through the Stanford study and that's part of the adjustment methodology, looks like they are well aware of these discrepancies hence the 2x range. Still better than the 10x 10000-100000 models !
No.
They had three problems.
1. Test accuracy. The manufacturer's specs claimed 100% sensitivity (of positive samples, 0% reported as false negatives) and 98% specificity (of negative samples, 2% reported as false positives). The manufacturer is a little Chinese company, the Chinese FDA didn't approve the test, China has now forbidden its export, and the specs are based on only 375-ish samples, so how much do you trust the specs? Stanford tried to revalidate by running the test on 30 known positives and 30 known negatives, and got only 68% sensitivity (32% false negatives) and 100% specificity (0% false positives). But 60 samples is nowhere near enough to validate a test, especially if you don't have confidence in manufacturing consistency. So Stanford ended up using a range of possible test accuracies, from 68% to 100% sensitivity and 98% to 100% specificity. That's one reason for their extremely wide range of inferred results (a rough sketch of how much this range matters follows after problem 3). The problem is that the real values could easily be outside that range entirely.
2. Non-representative sample. They advertised for participants on Facebook, and those ads got passed around on FB and Nextdoor. The people who ended up coming to be tested were heavily skewed toward female, white, living around Palo Alto, and FB users, which is very different from Santa Clara County's demographics. They tried to adjust, for example by taking the results in each zip code and scaling them up or down to match that zip code's share of the county population (a sketch of what this reweighting looks like also follows after problem 3). They didn't adjust by age, income, or other important criteria (don't know why), and they didn't adjust by symptom history. Their adjustment took the raw rate from 1.5% to the inferred result of 2.5-4%. When your adjustment produces a 2-3x increase in the result, that tells you your methodology is a problem: the "result" is mostly reflecting your adjustment decisions.
3. Self-selection bias. Who would be most likely to drive to the test location, especially if they live on the other side of Santa Clara County? Maybe people who had reason to think they'd been sick and wanted to find out? Self-selection is a big problem in polling, and it probably was here too.
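To put rough numbers on problem 1: below is a quick back-of-envelope using the standard Rogan-Gladen correction, true_prev = (raw_rate + specificity - 1) / (sensitivity + specificity - 1). This is not the study's actual adjustment machinery, just a sketch that plugs the ~1.5% raw positive rate into the corners of the accuracy range described above.

# How much does the assumed test accuracy swing the inferred prevalence?
# Standard Rogan-Gladen correction (a sketch, not the study's exact method):
#   true_prev = (raw_rate + specificity - 1) / (sensitivity + specificity - 1)

raw_rate = 0.015  # ~1.5% of participants tested positive

for sens in (0.68, 1.00):        # sensitivity corners Stanford allowed
    for spec in (0.98, 1.00):    # specificity corners
        corrected = (raw_rate + spec - 1) / (sens + spec - 1)
        print(f"sens={sens:.0%} spec={spec:.0%} -> inferred prevalence {corrected:+.2%}")

# Prints everything from about -0.8% up to about +2.2%. The "answer" is
# driven almost entirely by which accuracy numbers you choose to believe.

Note the negative values: at 98% specificity, a 1.5% raw rate is consistent with essentially zero true infections.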
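And for problem 2, here's a minimal sketch of the zip-code piece of the adjustment. Every number below (zip labels, counts, positive rates, population shares) is invented for illustration, not taken from the study; the point is only to show how reweighting a handful of positives in an under-sampled zip can push a ~1.5% raw rate into the 2.5-4% neighborhood.

# Sketch of zip-code post-stratification. All numbers below are made up
# to show the mechanics; they are NOT the study's data.

# zip: (sample_size, sample_positives, share_of_county_population)
zips = {
    "zip_A": (800, 8, 0.03),  # over-represented in the sample (think Palo Alto)
    "zip_B": (200, 8, 0.12),  # under-represented, with a small cluster of positives
    "zip_C": (100, 1, 0.05),
}

total_n = sum(n for n, _, _ in zips.values())
raw_rate = sum(pos for _, pos, _ in zips.values()) / total_n

# Weight each zip's positive rate by its share of the county population
# instead of its share of the sample.
share_total = sum(share for _, _, share in zips.values())
weighted_rate = sum((pos / n) * (share / share_total)
                    for n, pos, share in zips.values())

print(f"raw rate:      {raw_rate:.2%}")       # ~1.5%
print(f"weighted rate: {weighted_rate:.2%}")  # ~2.8%

# The positives in the under-sampled zip get scaled up roughly 3x, dragging
# the county-wide estimate with them. With so few positives overall, the
# adjusted number is extremely sensitive to where they happened to live.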
This study will not pass peer review - it is getting ripped apart by peers - but the Stanford docs quickly published an Op-Ed touting the study, trying to get the most publicity before their study is retracted or revised. There is really sloppy work going on, under the pressure and incentives created by covid - unfortunately this group of Stanford docs is an example.
Note Stanford has developed its own in-house antibody test, which was not used in this study. The Stanford webpage announcing the new test explicitly notes it is not the same as the test used in the Santa Clara study. Hmm.
TL;DR: if your test has a 2% false positive rate and you're testing for something present in 1-2% of the population, you're going to get junk answers.
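For anyone who wants the arithmetic behind that TL;DR, here's a sketch assuming roughly 3,300 people tested, 1.5% true prevalence, a 2% false positive rate, and a sensitivity picked from the middle of the range discussed above (all assumed numbers, and the sensitivity barely matters here):

# The TL;DR in numbers: a 2% false positive rate vs ~1.5% true prevalence.
n = 3300            # roughly the number of people tested (illustrative)
prevalence = 0.015  # assume 1.5% really have antibodies
sensitivity = 0.80  # assumed; anywhere in the 68-100% range tells the same story
specificity = 0.98  # i.e. a 2% false positive rate

true_pos = n * prevalence * sensitivity
false_pos = n * (1 - prevalence) * (1 - specificity)

print(f"true positives:  {true_pos:.0f}")   # ~40
print(f"false positives: {false_pos:.0f}")  # ~65
print(f"share of positives that are real: {true_pos / (true_pos + false_pos):.0%}")  # ~38%

# More than half of the expected positives are noise, so the final estimate
# hinges on knowing that 2% false positive rate far more precisely than a
# 30-sample validation can tell you.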