
Generative AI: An Introduction

This guide provides an introduction to generative AI.

Distortions

Generative AI is built on large language models (LLMs) that are "trained" on enormous amounts of data. These sources may reflect biases and stereotypes, and when generative AI uses them to train a model, its output may reproduce misleading or inaccurate information. In addition, the Internet serves as a major source of data for LLM training and is dominated by English-language resources, which may lead generative AI models to underrepresent knowledge and ideas from non-English-speaking communities and nations.

While content generated by AI can be useful, users should be aware of the possibility of bias in the output.

Hallucinations and Disinformation

Hallucinations occur when generative AI tools and applications produce incorrect or misleading content. The fabricated content is presented in a way that appears authentic, which can make the errors difficult to identify. One common hallucination occurs when generative AI is prompted for research citations in a given area; the citations may or may not point to actual literature.

Image- and sound-based AI tools are also subject to hallucinations. Generative AI may add pixels in ways that do not accurately reflect the object being depicted. This is why, for example, image-generation tools are notorious for adding extra fingers to hands.

Another significant issue over the past two years has been the deliberate misrepresentation of images, audio, and text, often referred to as "deepfakes."

Privacy and Data Extraction

Content and prompts submitted to a generative AI tool may be added to the tool's training data and, potentially, shared with other users. User agreements often give the company the right to collect information on users and their interactions with the application. In addition, these privacy and usage agreements can be updated at any time.

Some questions to keep in mind when using generative AI applications: Are you uploading sensitive or confidential information? Would disclosure of this information violate any federal or state laws, or any university policies? Would disclosure of this information prevent future uses you may have for it (such as publishing an article or filing a patent)?


Environmental Footprint

Generative AI systems rely on a vast infrastructure that requires tremendous amounts of resources, including energy and water. For this reason, the environmental footprint of generative AI is often cited as one of the ethical concerns associated with its development and use.