More AI Concerns

Last year, the November blog mentioned some of the challenges with Generative Artificial Intelligence (genAI). The tools that are becoming available still need to learn from existing material. It was mentioned that the tools can create imaginary references or have other types of “hallucinations”. Reference 1 quotes the results of a Stanford study in which the tools made mistakes 75% of the time on legal matters. The authors stated: “in a task measuring the precedential relationship between two different [court] cases, most LLMs do no better than random guessing.” The contention is that the Large Language Models (LLMs) are trained by fallible humans. The article further states that the larger the body of data the models have available, the more random or conjectural their answers become. The authors argue for a formal set of rules that would be employed by the developers of the tools.

Reference 2 states that one must understand the limitations of AI and its potential faults. Basically, the guidance is not only to know the type of answer you are expecting, but also to evaluate obtaining the answer through a similar but different approach, or to use a competing tool to verify the potential accuracy of the initial answer. From Reference 1, organizations need to be aware of the limits of LLMs with respect to hallucination, accuracy, explainability, reliability, and efficiency. What was not stated is that the specific question needs to be carefully drafted to focus on the type of solution desired.
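As a concrete illustration of this guidance, the following sketch (in Python) asks the same carefully drafted question of two independent tools and flags disagreement for a human reviewer. The query functions are hypothetical placeholders, not real APIs; they would be replaced by calls to whichever genAI services are actually available.

    # Hypothetical sketch: cross-check one question against two independent tools.

    def query_tool_a(question: str) -> str:
        # Placeholder: replace with a call to the first genAI tool.
        return "Answer from tool A"

    def query_tool_b(question: str) -> str:
        # Placeholder: replace with a call to a competing tool.
        return "Answer from tool B"

    def cross_check(question: str) -> dict:
        """Collect answers from two independent tools so a person can compare them."""
        answer_a = query_tool_a(question)
        answer_b = query_tool_b(question)
        return {
            "question": question,
            "tool_a": answer_a,
            "tool_b": answer_b,
            # Naive agreement flag; the real judgment belongs to a reviewer
            # who already knows what type of answer to expect.
            "answers_agree": answer_a.strip().lower() == answer_b.strip().lower(),
        }

    if __name__ == "__main__":
        result = cross_check("What is the precedential relationship between case X and case Y?")
        for key, value in result.items():
            print(f"{key}: {value}")

The value is not that the code is clever, but that the same question is asked twice and any disagreement is made visible before the answer is trusted.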

Reference 3 addresses the data requirement. How the information can be handled depends on whether the data is structured or unstructured. The reference also employs the term derived data, which is data that is developed from elsewhere and formulated into the desired structure/answers. The data needs to be organized (formed) into a useful structure for the program to use it efficiently. Once AI is applied within an organization, its growth can and probably will be rapid. To manage the potential failures, the suggestion is to employ a modular structure, so that areas of issues can be isolated and more easily addressed.
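The modular suggestion can be pictured as a small pipeline in which each stage is a separate, testable function, so a bad output can be traced to one stage rather than debugged as a single monolith. The stage names and record shapes below are illustrative assumptions, not a design prescribed by the reference.

    # Hypothetical sketch: a modular data-to-structure pipeline (Python 3.9+).
    from typing import Any

    def ingest(raw_source: str) -> list[str]:
        # Stage 1: pull in raw, possibly unstructured, records.
        return [line for line in raw_source.splitlines() if line.strip()]

    def derive(records: list[str]) -> list[dict[str, Any]]:
        # Stage 2: form "derived data" -- reshape raw records into the
        # structure the downstream tool expects.
        return [{"id": i, "text": text} for i, text in enumerate(records)]

    def validate(structured: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Stage 3: reject records that fail basic checks before a model sees them.
        return [rec for rec in structured if rec["text"] and len(rec["text"]) < 10_000]

    def pipeline(raw_source: str) -> list[dict[str, Any]]:
        # Because each stage is isolated, a failure can be narrowed to
        # ingest, derive, or validate instead of the whole system.
        return validate(derive(ingest(raw_source)))

    if __name__ == "__main__":
        print(pipeline("first record\n\nsecond record"))

Kept small like this, each module can be tested and replaced on its own as the organization's use of AI grows.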

Reference 4 warns of the potential of “data poisoning”. “Data poisoning” is the term employed when incorrect or misleading information is incorporated into a model’s training. This is a risk because of the large amounts of data that are incorporated into the training of a model. The basis of this concern is that many models are trained on open-web information. It is difficult to spot malicious data when the sources are spread far and wide over the internet and can originate anywhere in the world. There is a call for legislation to oversee the development of the models. But how does legislation prevent an unwanted insertion of data by an unknown programmer? Without verification of the accuracy of the data sources, can the data be trusted?
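One partial defense that follows from this concern is to record the provenance of training data and keep only examples from sources an organization has decided to trust. The sketch below is a deliberately simplified illustration; the source names and record format are assumptions, and real provenance checking for open-web data is far harder than an allowlist.

    # Hypothetical sketch: drop training examples whose origin is not on an allowlist.

    TRUSTED_SOURCES = {"internal-corpus", "licensed-publisher"}

    training_examples = [
        {"text": "A vetted example.", "source": "internal-corpus"},
        {"text": "An example scraped from an unknown site.", "source": "open-web"},
    ]

    def filter_by_provenance(examples: list) -> list:
        """Keep only examples whose recorded source is trusted."""
        kept = [ex for ex in examples if ex.get("source") in TRUSTED_SOURCES]
        rejected = len(examples) - len(kept)
        print(f"kept {len(kept)} examples, rejected {rejected} with untrusted provenance")
        return kept

    if __name__ == "__main__":
        clean_examples = filter_by_provenance(training_examples)

Such a filter does not answer the legislative question raised above, but it shows the kind of verification of sources that would have to exist before the data could be trusted.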

There are suggestions that tools need to be developed that can backtrack the output of an AI tool to evaluate the steps that might have been taken that could lead to errors. The limiting factor is the power consumption of current and projected future AI computational requirements. There is not enough power available to meet the projected needs, and if another layer is built on top of that for checking the initial results, the power requirement increases even faster. The systems in place cannot provide the projected power demands of AI. [Ref. 5] The sources for the anticipated power have not been identified, much less a projected date of when the power would be available. This should produce an interesting collision between the desire for more computing power and the ability of countries to supply the needed levels of power.

References:

  1. https://www.computerworld.com/article/3714290/ai-hallucination-mitigation-two-brains-are-better-than-one.html
  2. https://www.pcmag.com/how-to/how-to-use-google-gemini-ai
  3. “Gen AI Insights”, InfoWorld publication, March 19, 2024
  4. “Beware of Data Poisoning”, WSJ, Pg. R004, March 18, 2024
  5. “The Coming Electricity Crisis”, WSJ Opinion, March 29, 2024

About Walt

I have been involved in various aspects of nanotechnology since the late 1970s. My interest in promoting nano-safety began in 2006 and produced a white paper in 2007 explaining the four pillars of nano-safety. I am a technology futurist and am currently focused on nanoelectronics, single digit nanomaterials, and 3D printing at the nanoscale. My experience includes three startups, two of which I founded, 13 years at SEMATECH, where I was a Senior Fellow of the technical staff when I left, and 12 years at General Electric with nine of them on corporate staff. I have a Ph.D. from the University of Texas at Austin, an MBA from James Madison University, and a B.S. in Physics from the Illinois Institute of Technology.