I’ve written previously about validating scientific findings. With the current COVID-19 (coronavirus) situation, there have been numerous published claims of various “facts”, all of which are based on models. Only a couple of months ago, the news carried projections of 8 billion people being infected and 80 million people dying. That latter number was later reduced to 40 million. In the US, there were projections of up to 3 million deaths, a number that has been continuously reduced to a possible maximum of 200,000, with the possibility of it being much lower. The current number of reported deaths is just over 60,000 as I write this blog.
The “facts” two months ago were that 80 million people worldwide would die. As of April 30, 2020, the worldwide death toll stands at 226,882, with a projected total in the mid-to-upper 100,000s. This trend indicates that the final number will be less than 1% of the original projection! What happened? There are many questions that need to be answered, but that needs to be done by the developers of the model.
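As a quick check of that arithmetic, using only the numbers quoted above, a few lines of Python confirm the less-than-1% claim:

```python
original_projection = 80_000_000   # early worldwide death projection (from the text)
reported_deaths = 226_882          # reported worldwide deaths as of April 30, 2020

share = reported_deaths / original_projection
print(f"{share:.2%} of the original projection")   # well under 1%

# 1% of the original projection would be 800,000 deaths, so even a
# final total in the upper 100,000s stays below the 1% mark.
one_percent = original_projection * 0.01
print(f"1% threshold: {one_percent:,.0f} deaths")
```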
The issue addressed below is why models, and the results they produce, need to be understood in order to correctly explain what the presented “facts” actually mean.
First, the information provided as “facts” was not fact but projections based on someone’s model of the situation. A comment years ago by a friend, Professor Bob Shannon of Texas A&M, explained it well: “All models are WRONG, some are useful!” A strong statement, which we will explore.
Why are all models wrong? The answer is that models are based on assumptions. (I have spent considerable time working in modeling.) A model is only as correct as the mathematical description of the object being evaluated, the accuracy of the assumptions being made, the inclusion of all the key variables, and an estimate of the probability of those variables occurring. Usually models are built, tested, modified, tested again, and finally run multiple times over a set of probabilities. The resultant answers yield a possible projection with a probability range. There are usually results that provide the extremes as well as the most probable outcome. Therefore, the ANSWER is not a single number but a variable with a probability range based on certain assumptions. Notice the word “assumptions”; it is plural. The results of a model are only as good as its assumptions. If you do not know the assumptions, you are unable to evaluate the results of the model. In addition, models need to be improved as the analysis continues. This is why it is called modeling. There are some suggestions that the basic virus impact model has not changed. [Ref. #1] That in itself is unusual. Models need to be continually updated to reflect learning from earlier versions.
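The run-many-times approach described above can be sketched in a few lines of Python. The model here is a deliberately crude toy (deaths = population × infection rate × fatality rate), and the assumption ranges are purely hypothetical; the point is only that varying the assumptions produces a range of outcomes with extremes and a most probable value, not a single number:

```python
import random
import statistics

def projected_deaths(infection_rate, fatality_rate, population):
    """Toy projection: deaths = population * infection rate * fatality rate."""
    return population * infection_rate * fatality_rate

def run_model(trials=10_000, population=330_000_000, seed=42):
    """Run the toy model many times, sampling the uncertain assumptions."""
    random.seed(seed)
    results = []
    for _ in range(trials):
        # Each trial draws the uncertain assumptions from hypothetical ranges.
        infection_rate = random.uniform(0.05, 0.60)   # assumed plausible range
        fatality_rate = random.uniform(0.001, 0.02)   # assumed plausible range
        results.append(projected_deaths(infection_rate, fatality_rate, population))
    # Report the extremes and the middle of the distribution, not one number.
    return min(results), statistics.median(results), max(results)

low, mid, high = run_model()
print(f"low ≈ {low:,.0f}, median ≈ {mid:,.0f}, high ≈ {high:,.0f}")
```

Change the assumption ranges and the entire answer changes, which is exactly why a projection quoted without its assumptions cannot be evaluated.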
A commentary by Holman W. Jenkins, Jr. [Ref. #2] offers thoughts about the media’s inability to handle multivariate situations. Things are rarely as simple as “if A, then B”. A short version: “When it rains, Jack always wears a hat.” Does this imply that Jack wearing a hat causes it to rain? Of course not. There are other variables. Does Jack need to keep his head covered due to a skin problem? Does Jack always wear a hat, rain or shine? This is a simple example, but when reporting reduces a story to a single number, it loses the contributing factors. A better understanding of how models work is required to report on them accurately.
Yes, there is a need to evaluate situations that may cause a singularity, also called a black swan event. At one time, black swans were considered fiction; then people found one. It was rare at the time. Now they are not that rare. Similarly, 100-foot rogue waves were first considered fiction and then black swan events. Thanks to satellite imaging, we now know that they happen relatively often in certain parts of the world under certain conditions. The point is that when looking at the results of a model, one needs to consider the possibility of such an event, but not use it as the final answer. Is it possible that 8 billion of the current 8.8 billion people could get the virus, a 91% world infection rate? Possibly yes, but that would not be the most probable outcome, and it would require a significant reevaluation of the modeling assumptions.
What we, as the public, need to hear and understand is what assumptions were made in developing the models. The first model addresses the total impact and how it is spread over time. The second model addresses what is being proposed and what its impact would be. The concern in the current situation that required governmental intervention was the potential for a huge number of cases that would overwhelm the medical system. By using a distancing model of 6 feet and a requirement of sheltering in place, the rate of infection is slowed and spread over a longer period so that medical facilities are not overwhelmed. This does not indicate that the fatality rate is lower due to these regulations; it has only been delayed. If a vaccine is created, it will lower the fatality rate. To present anything otherwise indicates a failure to understand what the models are saying. Large-number projections may make people nervous and provide revenue for media, but they end up forcing the improper allocation of resources.
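The “slow the spread so hospitals are not overwhelmed” reasoning above can be sketched with a toy discrete-time SIR epidemic model. The parameter values here are hypothetical, chosen only to show the qualitative effect: a lower contact rate (standing in for distancing) produces a later and lower peak of simultaneous infections, even though infections still occur over the longer run:

```python
def sir(beta, gamma=0.1, population=1_000_000, days=365):
    """Toy discrete-time SIR model.

    beta  = daily contact/transmission rate (lowered by distancing)
    gamma = daily recovery rate
    Returns (peak simultaneous infections, day of that peak).
    """
    s, i, r = population - 1.0, 1.0, 0.0
    peak_i, peak_day = i, 0
    for day in range(1, days + 1):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        if i > peak_i:
            peak_i, peak_day = i, day
    return peak_i, peak_day

# Hypothetical contact rates: without vs. with distancing measures.
peak_a, day_a = sir(beta=0.4)
peak_b, day_b = sir(beta=0.2)
print(f"no distancing: peak {peak_a:,.0f} infected on day {day_a}")
print(f"distancing:    peak {peak_b:,.0f} infected on day {day_b}")
```

In this sketch the distancing run peaks later and lower, which is the point of the policy: keeping the peak load within hospital capacity, not eliminating the disease outright.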
The Mayo Clinic responded to projections that COVID-19 would require a major allocation of resources. This resulted in the cessation or postponement of elective surgeries, cancer treatments, and other related medical procedures. The projected large number of seriously ill people did not materialize. The result is that the Mayo Clinic is furloughing, or giving pay cuts to, about one third of its 70,000 employees. [Ref. #3] This does not include the impact on related, externally contracted workers, nor does it address the impact on the patients who were unable to have their procedures.
Could all of this over-allocation of resources be based on a lack of understanding of what is involved in establishing guidance based on the unknowns in models? One needs to know what is involved in the assumptions, variables, and probabilities. After the models are run, there is a final question: does the answer make sense, or could there be elements missing or misstated? A projection of 91% of the world being infected raises, for me, a very serious question about the validity of the model.
If the news media responds to analyses with single-number answers, and those answers turn out to be inaccurate, can there be any guarantee of developing a true understanding of the problem? I doubt it. The consequence of this type of “factual presentation” is that the general public loses trust in any statements that are published, and with that comes a loss of confidence in leadership. Scientific facts need to be presented accurately, with the assumptions accompanying the results. Integrity is required in every step of the process.
I do not provide an explicit email address because it would become overloaded with spam and advertising. I can be reached at “ideas at nano-blog dot com”; replace the “at” and “dot” with the appropriate symbols.