Thread of Thought (ThoT): A New Prompting Approach to Complex Contexts

Praveen Govindaraj
4 min read · Nov 19, 2023

The paper “Thread of Thought Unraveling Chaotic Contexts” discusses the challenges Large Language Models (LLMs) face in chaotic contexts and introduces a new strategy named “Thread of Thought” (ThoT) to improve reasoning performance.

Whitepaper — Thread of Thought Unraveling Chaotic Contexts.

Key points of this paper include:

Context

LLMs struggle in chaotic contexts, like those involving distractors, leading to information omission.

ThoT Strategy

Inspired by human cognitive processes, ThoT methodically segments and analyzes extended contexts, selecting relevant information.

Plug & Play ThoT Prompt:

Walk me through this context in manageable parts step by step, summarizing and analyzing as we go.
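Since the ThoT trigger is just one sentence appended to the prompt, it can be assembled mechanically from any retrieved context and question. A minimal sketch; the `build_thot_prompt` helper and the exact prompt layout are illustrative assumptions, modeled on the paper’s example format:

```python
# The plug-and-play ThoT trigger sentence from the paper.
THOT_TRIGGER = (
    "Walk me through this context in manageable parts step by step, "
    "summarizing and analyzing as we go."
)

def build_thot_prompt(context: str, question: str) -> str:
    """Assemble a chaotic context, a question, and the ThoT trigger
    into a single prompt (layout is illustrative)."""
    return f"{context}\nQ: {question}\n{THOT_TRIGGER} A:"
```

Any retrieved passages can be concatenated into `context`, so no task-specific prompt engineering is needed beyond the trigger itself.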

Example Use Case:

  • Context:

As a content reviewer, I provide multiple retrieved passages about this question; you need to answer the question. retrieved Passage 1 is: to introduce book vending machines, … retrieved Passage 2 is: of the Carlsbad Decrees, it quickly … retrieved Passage 3 is: It is at “Kreuzstraße 12”, opposite … retrieved Passage 4 is: house in Leipzig remained there, … retrieved Passage 5 is: Delcam Delcam is a supplier of … retrieved Passage 6 is: inhabitants are called recreinos. … retrieved Passage 7 is: ‘politically unreliable’ writers like … retrieved Passage 8 is: Recheio Recheio is the biggest … retrieved Passage 9 is: via the Internet, large volumes of … retrieved Passage 10 is: without needing to pay any royalties, …

Q: Where was Reclam founded?

{Prompting Method} A:

  • ThoT Prompt: “Walk me through this context in manageable parts step by step, summarizing and analyzing as we go.”

The authors compare ThoT against the Chain-of-Thought (CoT) prompting technique. Let’s see the difference in detail.

Model’s Response:

  1. “According to retrieved Passage 2, Anton Philipp Reclam founded his own publishing house in Carlsbad…”
  2. “In retrieved Passage 4, it mentions that the Reclam publishing house in Leipzig…”
  3. “Finally, in retrieved Passage 7, it states that after the partition of Germany…”
  4. Conclusion: “Therefore, Reclam was originally founded in Leipzig, Germany, and later moved its main office to Stuttgart.”


How it differs from CoT

Whereas CoT follows a single linear reasoning chain suited to structured contexts, ThoT segments a chaotic context into parts, summarizes and analyzes each part, and only then draws a conclusion.

Integration

ThoT can be integrated with various LLMs and prompting techniques as a versatile module.
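Because ThoT is prompt-level, this integration can be as simple as wrapping whatever model call you already have. A minimal sketch, assuming `llm` is any function that takes a prompt string and returns a completion; the two-stage split and the stage-2 wording (“Therefore, the answer is:”) are illustrative, not quoted from the paper:

```python
from typing import Callable

# The plug-and-play ThoT trigger sentence from the paper.
THOT_TRIGGER = (
    "Walk me through this context in manageable parts step by step, "
    "summarizing and analyzing as we go."
)

def with_thot(llm: Callable[[str], str]) -> Callable[[str, str], str]:
    """Wrap any text-in/text-out LLM callable so queries go through ThoT."""
    def answer(context: str, question: str) -> str:
        # Stage 1: step-by-step walkthrough of the chaotic context.
        analysis = llm(f"{context}\nQ: {question}\n{THOT_TRIGGER} A:")
        # Stage 2 (illustrative): distill the walkthrough into a final answer.
        return llm(f"{analysis}\nTherefore, the answer is:")
    return answer
```

The wrapper makes no assumptions about the underlying model, which is what lets ThoT plug into GPT-style APIs and open models alike.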

Datasets Used

  1. PopQA Dataset: A dataset designed for long-tail question answering, containing knowledge often unfamiliar to large models.
  2. EntityQ Dataset: Similar to PopQA, this dataset features questions that require reasoning over long-tail knowledge.
  3. Multi-Turn Conversation Response (MTCR) Dataset: Specifically collected for this study, it’s based on everyday conversations to assess the ThoT methodology.

Evaluation metric

Exact Match (EM): This metric measures the accuracy of the model’s responses by comparing them to the correct answers. A response is considered correct only if it exactly matches the correct answer.

This metric is crucial for assessing the effectiveness of the ThoT strategy in improving the reasoning performance of Large Language Models in chaotic contexts.
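A minimal sketch of EM scoring; the whitespace and case normalization below is an assumption of this sketch, since benchmarks differ in exactly how they normalize answers before comparing:

```python
def exact_match(prediction: str, gold: str) -> bool:
    """EM: a prediction counts as correct only if it matches the gold
    answer exactly (after illustrative whitespace/case normalization)."""
    return prediction.strip().lower() == gold.strip().lower()

def em_score(predictions: list[str], golds: list[str]) -> float:
    """Fraction of predictions that exactly match their gold answers."""
    matches = sum(exact_match(p, g) for p, g in zip(predictions, golds))
    return matches / len(golds)
```

Note how strict EM is: a prediction of “Leipzig, Germany” would score zero against a gold answer of “Leipzig”, which is why a 10–20% EM gain is substantial.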

Methodology evaluation — Credits to authors of the paper

As the evaluation table above shows, ThoT improves accuracy by 10–20%.

Model used

GPT-4, GPT-3.5, Llama2-70B, Llama2-13B, and Llama2-7B

Improved Performance

In experiments, ThoT shows significant improvement in reasoning over other prompting techniques, raising the EM score (accuracy) by 10–20%.

Methodology

ThoT’s approach involves systematic segmentation, summarization, and analysis, which aligns with human cognitive patterns.

The paper emphasizes the importance of structured processing and ThoT’s capability to enhance LLM performance in handling complex contexts.

FAQ

What is the main challenge addressed by ThoT?

ThoT tackles difficulties in chaotic contexts where information is complex and disorganized.

How does ThoT differ from CoT?

  • ThoT is designed for chaotic contexts, while CoT follows a linear reasoning process suitable for structured contexts.

What datasets were used in testing ThoT?

  • PopQA, EntityQ, and a custom Multi-Turn Conversation Response (MTCR) dataset.

Did ThoT show improvement over other methods?

  • Yes, ThoT demonstrated marked improvement in reasoning performance compared to other methods.

Can ThoT be integrated with existing LLMs?

  • Yes, it’s a versatile “plug-and-play” module compatible with various LLMs.

What is the key advantage of ThoT?

  • Its ability to systematically segment and analyze chaotic contexts, enhancing information extraction and reasoning.
