Please use this identifier to cite or link to this item:
https://dair.nps.edu/handle/123456789/5268
Title: | Large Language Model (LLM) Comparison Research |
Authors: | Will Fisher |
Keywords: | Large language models; productivity; data analytics |
Issue Date: | 28-Aug-2024 |
Publisher: | Acquisition Research Program |
Citation: | APA |
Series/Report no.: | Acquisition Management;SYM-AM-24-174 |
Abstract: | Over the past few years, large language models (LLMs) have rapidly increased in capability, with OpenAI’s GPT-4 being the most prominent example. This case study explores two ways that GPT-4 could be used to assist research tasks: data analysis and writing executive summaries. We chose these tasks because they are common to Institute for Defense Analyses (IDA) projects and because they are often presented as tasks appropriate for LLMs. First, we used GPT-4 to conduct tasks such as data cleaning, exploration, modeling, and visualization. We compared the quality and speed of its work to those of a human doing the same tasks. We found analysis quality was insufficient when using AI alone, but improved greatly with a human partner. Using GPT-4 saved about 60% of the time on the data analysis assignment and presents an opportunity for significant cost savings in this area. Then, we used GPT-4 to generate executive summaries (EXSUMs) for three publicly available IDA publications, and we compared these to the human-generated EXSUMs. We found that the LLM-generated EXSUMs often failed to provide appropriate context for more technical papers, but that given their thoroughness and the speed at which they are generated, LLMs still present time- and cost-saving opportunities. |
Description: | SYM Presentation |
URI: | https://dair.nps.edu/handle/123456789/5268 |
Appears in Collections: | Annual Acquisition Research Symposium Proceedings & Presentations |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
SYM-AM-24-174.pdf | Presentation | 520.46 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.