Large Language Models (LLMs) built on self-attention are constrained by their context window. Self-attention compares every token with every other token, so compute and memory grow quadratically with sequence length, and as inputs get very long, models also become less reliable at using information that sits far from the query or in the middle of the text. The practical challenge is balancing the cost of a longer context against how much context the task actually requires.
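To make the quadratic growth concrete, here is a minimal sketch (using NumPy, with made-up tensor sizes) of the score matrix that self-attention materializes: for a sequence of length n, the matrix has n × n entries, so doubling the context roughly quadruples the attention cost.

```python
import numpy as np

def attention_scores(seq_len: int, d_model: int = 64) -> np.ndarray:
    """Toy illustration: the self-attention score matrix is seq_len x seq_len,
    so its size grows quadratically with context length."""
    rng = np.random.default_rng(0)
    q = rng.standard_normal((seq_len, d_model))   # one query vector per token
    k = rng.standard_normal((seq_len, d_model))   # one key vector per token
    scores = q @ k.T / np.sqrt(d_model)           # shape: (seq_len, seq_len)
    return scores

for n in (512, 2048, 8192):
    s = attention_scores(n)
    print(f"seq_len={n:>5}  score-matrix entries={s.size:>12,}")
```

Running this prints roughly 262 thousand, 4.2 million, and 67 million entries respectively, which is why long contexts quickly become expensive even before any quality issues with distant or mid-context tokens appear.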