Solving Inverse Kinematics with Large Language Models

StevenRice99/LLM-IK

This repository generates and tests inverse kinematics solutions written by large language models (LLMs) for kinematic chains with a single end effector. It provides a framework that builds an initial text prompt from a robot's data, describing its joint structure, and then produces feedback prompts from trial results to help a large language model improve its solution.

Usage

  • main.py has all methods.
  • configuration.py controls which robot, and which settings, the other methods use.
  • visualize.py lets you see and manually control a robot.
  • prompt.py prints a prompt to the console for you to give to a large language model. The prompt has a static start from "Prompts/prompt_start.txt" and a static end from either "Prompts/prompt_end_position.txt" or "Prompts/prompt_end_transform.txt", depending on what is being solved for. Between them is a dynamic portion containing information about the robot.
  • Once you have a solution from a large language model, create a folder under "Solvers" whose name matches that of the robot you are using under "Models". Inside it, create a subfolder named either "Position" or "Transform", depending on what you are solving for. In that subfolder, create a Python file named after the large language model that implements the function "inverse_kinematics", which takes a position to reach (or a position and orientation when solving for the entire transform) and returns the joint values.
  • test.py lets you run trials to test the inverse kinematics and generate feedback prompts that the large language model can use to try to improve the solution.
  • evaluate.py runs trials across all models and writes the results to CSV files.
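
As a rough illustration of what a solver file might contain, here is a minimal sketch for a hypothetical two-joint planar arm. The link lengths, the tuple-based argument and return types, and the "forward_kinematics" helper are assumptions made for this example only; the exact signature a solver must follow comes from the generated prompt, not from this sketch.

```python
import math

# Hypothetical link lengths for an illustrative two-joint planar arm.
# A real solver would use the values described in the generated prompt.
LINK_1 = 0.5
LINK_2 = 0.3


def inverse_kinematics(p):
    """Return (theta1, theta2) in radians that place the end effector at p = (x, y, z).

    Assumes a planar arm rotating about the z axis, so z is ignored.
    """
    x, y, _ = p
    # The law of cosines gives the elbow angle; clamp to guard against rounding error.
    cos_theta2 = (x * x + y * y - LINK_1 ** 2 - LINK_2 ** 2) / (2.0 * LINK_1 * LINK_2)
    cos_theta2 = max(-1.0, min(1.0, cos_theta2))
    theta2 = math.acos(cos_theta2)
    # The shoulder angle is the direction to the target minus the offset caused by the elbow.
    theta1 = math.atan2(y, x) - math.atan2(
        LINK_2 * math.sin(theta2), LINK_1 + LINK_2 * math.cos(theta2)
    )
    return theta1, theta2


def forward_kinematics(theta1, theta2):
    """Illustrative helper (not part of the repository) used to check the solution."""
    x = LINK_1 * math.cos(theta1) + LINK_2 * math.cos(theta1 + theta2)
    y = LINK_1 * math.sin(theta1) + LINK_2 * math.sin(theta1 + theta2)
    return x, y, 0.0
```

Feeding the returned joint values back through the forward kinematics should reproduce the requested position, which is the kind of check the trials perform.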

Future

  • As of now, all prompting and execution of the code is manual: prompts and results are output to the console and then entered into a chat interface for a large language model such as ChatGPT, Gemini, or HuggingChat.
  • In the future, automating this via APIs such as the OpenAI API could speed up iteration when creating and debugging inverse kinematics solutions.
  • Robot arms from the MuJoCo Menagerie library have been added to this repository and set up for use, but no existing large language model has been able to solve them. They are kept here for easy future experiments.
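
The automated loop described above could be sketched roughly as follows. This is not code from the repository: "refine_solution", "ask_llm", and "run_trials" are hypothetical names, and the two callables stand in for an API client and for the trial runner so that the sketch does not assume any particular service.

```python
from typing import Callable, Optional, Tuple


def refine_solution(
    initial_prompt: str,
    ask_llm: Callable[[str], str],
    run_trials: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Repeatedly ask a model for a solver and feed trial feedback back to it.

    ask_llm:    hypothetical callable that sends a prompt and returns solver code.
    run_trials: hypothetical callable that tests the code and returns
                (solved, feedback_prompt).
    Returns the first solver code that passes, or None if none pass in max_rounds.
    """
    prompt = initial_prompt
    for _ in range(max_rounds):
        code = ask_llm(prompt)
        solved, feedback = run_trials(code)
        if solved:
            return code
        # The feedback prompt becomes the next message sent to the model.
        prompt = feedback
    return None
```

Passing the model client and the trial runner in as arguments keeps the loop independent of any one API, so the same function could be tried against different providers.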