Abstract
Large Language Models (LLMs) have revolutionized the Artificial Intelligence (AI) field since the launch of ChatGPT in 2022. Since then, increasingly large models have been released, such as GPT-4o with over 175 billion parameters, Llama 3.1 with 405 billion parameters, and PaLM with 540 billion parameters. However, LLMs of this size are not feasible to run outside of the largest research labs and organizations due to the extremely large amount of GPU compute required for both training and inference. More recently, research effort has been devoted to creating smaller LLMs that still perform relatively well compared to much larger models. Research has also explored applying LLMs to domain-specific use cases, such as recommendation systems, via prompt engineering and fine-tuning. In this paper we combine these two lines of research and fine-tune two small LLMs (2 billion parameters or fewer) for the sequential recommendation task. We find that fine-tuned small LLMs perform on par with, and can even outperform, standard sequential recommendation baselines such as GRU4Rec and SASRec, especially in the cold-start setting.