Recent advances in generative artificial intelligence (AI), specifically large language models (LLMs), provide new possibilities for researchers to partner with AI when developing and refining psychological instruments. In this paper, we demonstrate how LLMs, such as OpenAI's ChatGPT 4 model, might be used to support the development of new psychometric scales. Partnering with AI for this purpose, however, comes with its share of potential pitfalls. We therefore emphasize throughout the paper that instrument development and refinement start and end with human judgment and expertise. We open with two use cases that describe how we used LLMs in the development and refinement of two new psychological instruments. Next, we discuss where and how researchers can use LLMs in the process of instrument development more broadly, including considerations for maximizing the benefits of LLMs and addressing the potential hazards of working with them. Finally, we close by offering initial suggestions for psychology researchers interested in partnering with LLMs in this capacity.