Current question answering systems leveraging retrieval-augmented generation (RAG) perform well on factoid questions but struggle with non-factoid questions, particularly how-to queries that require detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their interconnectivity. Extensive experiments across open-domain and industrial scenarios demonstrate that Thread outperforms existing data organization paradigms in RAG-based QA systems, significantly improving the handling of how-to questions.
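The abstract does not define a logic unit's concrete shape, but the idea of units linked by their interconnectivity can be illustrated with a minimal sketch: each unit carries its own step text plus explicit links to follow-up units, so a retriever can walk the chain for a how-to question instead of returning isolated chunks. All names and fields below are hypothetical, not the paper's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class LogicUnit:
    uid: str
    content: str                                      # the instruction or explanation text
    linker: list[str] = field(default_factory=list)   # uids of follow-up units

def walk(units: dict[str, LogicUnit], start: str) -> list[str]:
    """Follow linker edges from a starting unit to assemble a step-by-step answer."""
    steps, current, seen = [], start, set()
    while current and current not in seen:
        seen.add(current)
        unit = units[current]
        steps.append(unit.content)
        current = unit.linker[0] if unit.linker else None
    return steps

units = {
    "u1": LogicUnit("u1", "Step 1: check the service status page.", ["u2"]),
    "u2": LogicUnit("u2", "Step 2: restart the affected node.", []),
}
print(walk(units, "u1"))  # ['Step 1: ...', 'Step 2: ...']
```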
Effective incident management is pivotal for the smooth operation of Microsoft cloud services. To expedite incident mitigation, service teams gather troubleshooting knowledge into Troubleshooting Guides (TSGs) accessible to On-Call Engineers (OCEs). While automated pipelines can resolve the most frequent and straightforward incidents, complex incidents still require OCEs' intervention. In addition, TSGs are often unstructured and incomplete, requiring manual interpretation by OCEs, which leads to on-call fatigue and decreased productivity, especially among new-hire OCEs. In this work, we propose Nissist, which leverages unstructured TSGs and incident mitigation history to provide proactive incident mitigation suggestions, reducing human intervention. Using Large Language Models (LLMs), Nissist extracts knowledge from these sources to form a comprehensive knowledge base. Its multi-agent system design enhances proficiency in precisely discerning OCE intents, retrieving relevant information, and delivering systematic mitigation plans in sequence. Through our user experiments, we demonstrate that Nissist significantly reduces Time to Mitigate (TTM), alleviating operational burdens on OCEs and improving service reliability. Our webpage is available at https://aka.ms/nissist.
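The multi-agent flow the abstract describes (discern intent, retrieve relevant knowledge, deliver a plan) can be sketched roughly as below. This is only an illustrative skeleton under assumed names; the real system calls LLMs at each stage and is not shown here.

```python
# Hypothetical sketch of the Nissist-style pipeline described in the abstract:
# an intent agent classifies the OCE's query, a retrieval agent pulls matching
# knowledge-base entries, and a planning agent orders them into a mitigation plan.

def discern_intent(query: str) -> str:
    """Classify the OCE query; a real system would prompt an LLM here."""
    return "mitigation" if "error" in query.lower() else "clarification"

def retrieve(knowledge_base: dict[str, list[str]], intent: str) -> list[str]:
    """Fetch TSG-derived steps relevant to the discerned intent."""
    return knowledge_base.get(intent, [])

def plan(steps: list[str]) -> str:
    """Order the retrieved steps into a plan the OCE can follow."""
    return "\n".join(f"{i + 1}. {s}" for i, s in enumerate(steps))

kb = {"mitigation": ["Check recent deployments.", "Roll back the faulty build."]}
query = "Service returns error 500 after deployment"
print(plan(retrieve(kb, discern_intent(query))))
```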
This paper introduces Code-Vision, a benchmark designed to evaluate the logical understanding and code generation capabilities of Multimodal Large Language Models (MLLMs). It challenges MLLMs to generate a correct program that fulfills specific functionality requirements based on a given flowchart, which visually represents the desired algorithm or process. Code-Vision comprises three subsets, HumanEval-V, Algorithm, and MATH, which evaluate MLLMs' coding abilities across basic programming, algorithmic, and mathematical problem-solving domains. Our experiments evaluate 12 MLLMs on Code-Vision. The results reveal a large performance gap between proprietary and open-source models: on Hard problems, GPT-4o achieves 79.3% pass@1, while the best open-source model reaches only 15%. Further experiments show that Code-Vision poses challenges distinct from other multimodal reasoning benchmarks such as MMCode and MathVista. We also explore the reasons for the open-source models' poor performance. All data and code are available at https://github.com/wanghanbinpanda/CodeVision.
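For readers unfamiliar with the reported metric, pass@1 is typically computed with the standard unbiased pass@k estimator used for HumanEval-style benchmarks (Chen et al., 2021): generate n samples per problem, count the c that pass, and estimate pass@k = 1 - C(n-c, k)/C(n, k). The abstract does not state that Code-Vision uses this exact estimator, so the snippet below is a reference implementation of the common convention, not the paper's code.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: n samples generated per problem, c of them correct.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0  # fewer failures than k: some draw of k must contain a pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations per flowchart, 3 passing -> estimated pass@1
print(pass_at_k(n=10, c=3, k=1))  # 0.3
```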
A 2–26.5 GHz broadband amplifier monolithic microwave integrated circuit (MMIC) based on a 0.15 μm GaN high-electron-mobility transistor (HEMT) process has been developed. The broadband amplifier adopts a two-stage cascaded distributed-amplifier structure, which improves its gain. Further, capacitive coupling is used to improve the gain-bandwidth product of the power amplifier. To improve the output return loss of the wideband amplifier, a tuning stub is introduced at the circuit's output. On-wafer measurements show a gain greater than 18 dB, gain flatness within ±1.4 dB, and saturated output power above 35 dBm with greater than 13% power-added efficiency (PAE) across the entire band.
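To relate the reported units, power-added efficiency is conventionally defined as PAE = (P_out - P_in) / P_DC, with powers in watts rather than dBm. The quick calculation below converts the quoted 35 dBm output; the input drive and DC power figures are hypothetical placeholders, chosen only to show numbers in the same range as the >13% reported.

```python
# dBm -> watts conversion and the standard PAE definition.
# Only the 35 dBm output power comes from the abstract; P_in and P_dc are assumed.

def dbm_to_watts(p_dbm: float) -> float:
    return 10 ** (p_dbm / 10) / 1000.0

p_out = dbm_to_watts(35.0)   # ~3.16 W saturated output power (from the abstract)
p_in = dbm_to_watts(17.0)    # hypothetical input drive, ~50 mW
p_dc = 23.0                  # hypothetical DC power consumption in watts

pae = (p_out - p_in) / p_dc
print(f"PAE = {pae:.1%}")    # ~13.5%, in the same range as the >13% reported
```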