Artificial Intelligence Explainability and Accountability

Trustworthy Logical Reasoning for Large Language Models (LLMs)

Logical LLMs is a project to translate the output of large language models (LLMs) into a logic-based programming language (Prolog) in order to detect inconsistencies and hallucinations automatically. One goal of this project is to build a user interface through which users can give feedback that is then incorporated into the system. The overall project goal is to create a trustworthy, hybrid, open-source LLM tool that can learn from user feedback and explain its mistakes.
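As a rough illustration of the Prolog translation step, the sketch below asserts facts (hypothetically extracted from LLM answers) into SWI-Prolog via the pyswip bindings and queries a simple rule for contradictions. The facts, the inconsistent/1 rule, and the entity names are invented for this example, not the project's actual schema.

    # Minimal sketch: flag contradictory facts extracted from LLM output.
    # Assumes SWI-Prolog and the pyswip package are installed.
    from pyswip import Prolog

    prolog = Prolog()

    # Facts hypothetically extracted from two different LLM answers.
    prolog.assertz("born_in(ada_lovelace, london)")
    prolog.assertz("born_in(ada_lovelace, paris)")

    # An entity asserted to have two distinct birthplaces signals an inconsistency.
    prolog.assertz("inconsistent(P) :- born_in(P, X), born_in(P, Y), X \\= Y")

    flagged = {str(r["P"]) for r in prolog.query("inconsistent(P)")}
    print("Potentially hallucinated entities:", flagged)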

Collect Hallucinations and Facts

  • Topics: AI/ML, data collection, logic, user interfaces
  • Skills: javascript, html, python, bash, git
  • Difficulty: Easy/Moderate
  • Size: Large
  • Mentors: Leilani H. Gilpin (and a PhD student TBD).

Specific Tasks

  • Run queries in an LLM API with various prompts.
  • Create a user interface system that collects user feedback in a web browser.
  • Create a pipeline for storing the user data in a common format that can be shared in our database (a minimal sketch follows this list).
  • Document the tool for future maintenance.
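A minimal sketch of the storage step is shown below; the record fields, the file name, and the save_feedback helper are assumptions for illustration, not the project's actual schema.

    # Minimal sketch: append user feedback on an LLM response to a shared
    # JSON-lines log. Field names and the file path are illustrative assumptions.
    import json
    import time
    from pathlib import Path

    FEEDBACK_LOG = Path("feedback.jsonl")  # hypothetical common-format store

    def save_feedback(prompt: str, response: str, label: str, comment: str = "") -> dict:
        """Record one piece of user feedback in a common, shareable format."""
        record = {
            "timestamp": time.time(),
            "prompt": prompt,
            "response": response,
            "label": label,      # e.g. "fact" or "hallucination"
            "comment": comment,  # free-text feedback from the web interface
        }
        with FEEDBACK_LOG.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        return record

    # Example: a response the user flagged as a hallucination.
    save_feedback(
        prompt="Who wrote the first computer program?",
        response="Charles Babbage wrote the first computer program in 1950.",
        label="hallucination",
        comment="Wrong person and wrong century.",
    )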

Explaining failures in autograding

The eXplainable autograder (XAutograder) is a tool for autograding student coding assignments while providing personalized explanations and feedback. The goal of this project is to create an introductory set of coding assignments with explanations of wrong answers. This benchmark suite will be used for testing our system. The overall project goal is to create a dynamic autograding system that can learn from students' code and explain their mistakes.

Design introductory questions and explanations

  • Topics: AI/ML, AI for education, XAI (Explainable AI)
  • Skills: python, git
  • Difficulty: Moderate
  • Size: Large
  • Mentors: Leilani H. Gilpin (and a PhD student TBD).

Specific Tasks

  • Design 5-10 basic programming questions (aggregated from online sources, other courses, etc.).
  • Create correctness tests (unit tests) and a testing framework that takes a set of answers as input and produces a final assessment (a minimal sketch follows this list).
  • Create a set of baseline explanations for various error cases, e.g., out-of-bounds errors, syntax errors, etc.
  • Create a pipeline for iterating on the test cases and/or explanation feedback.
  • Document the tool for future maintenance.
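The sketch below illustrates one way such a testing framework might pair unit tests with baseline explanations; the test cases, explanation messages, and the grade helper are invented for this example rather than taken from XAutograder.

    # Minimal sketch: run a student's function against unit tests and map common
    # error types to baseline explanations. All names and messages are illustrative.
    EXPLANATIONS = {
        IndexError: "Out-of-bounds error: the code indexed past the end of a sequence.",
        TypeError: "Type error: an operation was applied to a value of the wrong type.",
        ZeroDivisionError: "Division by zero: check for empty input before dividing.",
    }

    def grade(student_fn, test_cases):
        """Return a score and per-case feedback for (args, expected) test pairs."""
        passed, feedback = 0, []
        for args, expected in test_cases:
            try:
                result = student_fn(*args)
            except Exception as exc:
                feedback.append(EXPLANATIONS.get(type(exc), f"Unexpected error: {exc!r}"))
                continue
            if result == expected:
                passed += 1
            else:
                feedback.append(f"For input {args}: expected {expected!r}, got {result!r}.")
        return {"score": passed / len(test_cases), "feedback": feedback}

    # Example: a buggy "return the last element" solution triggers the
    # out-of-bounds explanation on every test case.
    def student_last(xs):
        return xs[len(xs)]  # off-by-one bug

    print(grade(student_last, [(([1, 2, 3],), 3), (([7],), 7)]))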

Leilani H. Gilpin
Assistant Professor, UC Santa Cruz

Leilani H. Gilpin is an Assistant Professor in Computer Science and Engineering and an affiliate of the Science & Justice Research Center at UC Santa Cruz. She is part of the AI group at UCSC and leads the AI Explainability and Accountability (AIEA) Lab.