At present, robotic spraying is used mainly for single-color coating; effective automatic spraying methods for complex, customized multi-color patterns are still lacking. This paper presents an automatic robotic spraying method for customized digital camouflage patterns. The camouflage patterns are first designed according to the relevant technical manuals, and image processing techniques such as corner detection are used to extract the coordinates of each color block. Taking the spray gun's deposition model into account and combining it with robot path-planning techniques, a spray trajectory is then planned within each extracted camouflage block. Importing the resulting gun trajectories into the spraying robot realizes automatic spraying of customized digital camouflage patterns.
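A minimal sketch of the block-extraction and path-planning steps described above, using OpenCV. The color tolerance, the polygon-approximation accuracy, and the back-and-forth (boustrophedon) raster pattern are illustrative assumptions, not the paper's exact pipeline:

```python
import cv2
import numpy as np

def extract_block_corners(image_bgr, color_bgr, tol=20):
    """Mask one camouflage color and return the corner points of each block."""
    lo = np.clip(np.array(color_bgr) - tol, 0, 255).astype(np.uint8)
    hi = np.clip(np.array(color_bgr) + tol, 0, 255).astype(np.uint8)
    mask = cv2.inRange(image_bgr, lo, hi)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Approximate each contour by a polygon; its vertices are the block corners.
    return [cv2.approxPolyDP(c, 0.01 * cv2.arcLength(c, True), True).reshape(-1, 2)
            for c in contours]

def raster_spray_path(corners, spray_width):
    """Plan a back-and-forth gun trajectory over the block's bounding box,
    with adjacent passes one spray width apart."""
    (x0, y0), (x1, y1) = corners.min(axis=0), corners.max(axis=0)
    path = []
    for i, y in enumerate(np.arange(y0, y1 + spray_width, spray_width)):
        xa, xb = (x0, x1) if i % 2 == 0 else (x1, x0)  # alternate direction
        path += [(xa, y), (xb, y)]
    return path
```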
In this paper, we aim to improve abstractive dialogue summarization quality and, at the same time, enable granularity control. Our model has two primary components: 1) a two-stage generation strategy that first produces a preliminary summary sketch serving as the basis for the final summary; the sketch provides a weakly supervised signal in the form of pseudo-labeled interrogative-pronoun categories and key phrases extracted with a constituency parser; and 2) a simple granularity-control strategy, whereby the model can automatically determine or explicitly control the number of generated summary sentences for a given dialogue by predicting and highlighting different text spans in the source. Our model achieves state-of-the-art performance on SAMSum, the largest dialogue summarization corpus, reaching a ROUGE-L score of 50.79. In addition, a case study and human evaluation show that our summaries are competitive with human-annotated ones and are controllable.
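A minimal sketch of the two-stage control flow described above. It reuses a single off-the-shelf BART checkpoint (facebook/bart-large-cnn) purely to show how a stage-1 sketch can condition stage-2 generation; the paper trains dedicated sketch and summary models with span highlighting, which this sketch does not reproduce:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

def generate(text, max_len=80):
    ids = tok(text, return_tensors="pt", truncation=True).input_ids
    out = model.generate(ids, max_length=max_len, num_beams=4)
    return tok.decode(out[0], skip_special_tokens=True)

dialogue = "Amanda: I baked cookies. Do you want some?\nJerry: Sure! Thanks :)"
sketch = generate(dialogue)                 # stage 1: preliminary summary sketch
final = generate(sketch + " " + dialogue)   # stage 2: condition on sketch + source
print(final)
```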
Hallucination is a known issue for neural abstractive summarization models. Recent work suggests that the degree of hallucination may depend on errors in the training data. In this work, we propose a new method, Contrastive Parameter Ensembling (CaPE), to use training data more effectively, exploiting variations in sample noise to reduce hallucination. We first select clean and noisy subsets of the training data using different automatic factuality metrics. Then, we fine-tune a base summarization model, trained on all training samples, on the clean (noisy) subset to obtain an \textit{expert} (\textit{anti-expert}) model. Finally, we adjust the parameters of the base model by the difference between the parameters of the \textit{expert} and \textit{anti-expert} models, steering the base model toward the \textit{expert} and away from the \textit{anti-expert}. Experimental results show that CaPE improves performance across different automatic factuality metrics and in human evaluation, with maximum improvements of 16.69\% and 15.78\% in summary-level dependency-arc entailment accuracy on the XSUM and CNN/DM datasets, respectively. The gain in factuality does not degrade informativeness metrics such as ROUGE.
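The final adjustment can be written as theta_base + alpha * (theta_expert - theta_anti). A minimal PyTorch sketch, where the scaling factor alpha is an assumed hyperparameter (the abstract does not specify how the parameter difference is scaled):

```python
import torch

@torch.no_grad()
def cape_combine(base, expert, anti_expert, alpha=0.5):
    """theta <- theta_base + alpha * (theta_expert - theta_anti).
    alpha is an assumed scaling factor, to be tuned on validation data."""
    base_sd, exp_sd, anti_sd = (m.state_dict() for m in (base, expert, anti_expert))
    # Integer buffers (e.g., position ids) are copied from the base model unchanged.
    new_sd = {k: v + alpha * (exp_sd[k] - anti_sd[k]) if v.is_floating_point() else v
              for k, v in base_sd.items()}
    base.load_state_dict(new_sd)
    return base
```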
Pre-trained language models (PLMs) have been shown to be effective for zero-shot (0shot) text classification. 0shot models based on natural language inference (NLI) and next sentence prediction (NSP) use a cross-encoder architecture and infer by making a forward pass through the model for each label-text pair separately, so the computational cost of inference grows linearly with the number of labels. In this work, we improve the efficiency of such cross-encoder-based 0shot models by restricting the set of likely labels with a conformal predictor (CP) built on a fast base classifier and calibrated on samples labeled by the 0shot model. Because a CP generates prediction sets with coverage guarantees, it reduces the number of candidate labels without excluding the label the 0shot model finds most probable. We experiment on three intent and two topic classification datasets. With a suitable CP for each dataset, we reduce the average inference time of the NLI- and NSP-based models by 25.6% and 22.2%, respectively, without dropping performance below the predefined error rate of 1%.
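A minimal split-conformal sketch of the calibration and label-filtering steps, assuming the standard nonconformity score 1 - p(label) computed from the fast base classifier; the paper's exact CP construction may differ:

```python
import numpy as np

def calibrate(cal_probs, cal_labels, eps=0.01):
    """Nonconformity = 1 - p(pseudo-label assigned by the 0shot model).
    Returns the score threshold giving roughly (1 - eps) coverage."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    q = min(np.ceil((n + 1) * (1 - eps)) / n, 1.0)  # finite-sample correction
    return np.quantile(scores, q)

def prediction_set(label_probs, qhat):
    """Indices of labels kept for the expensive cross-encoder pass."""
    return np.where(1.0 - label_probs <= qhat)[0]
```

The NLI/NSP cross-encoder is then run only on the labels in the prediction set, which cuts per-example inference cost whenever the set is smaller than the full label space.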
Prompt tuning has proven successful on various tasks: it adds a small number of trainable parameters while keeping large pre-trained language models (PLMs) frozen. However, it remains unsettled how to generate prompts better suited to individual examples and how to extend prompt tuning to multi-task learning by leveraging cross-task features. To address these challenges, we propose token-wise prompt tuning (TPT), which builds a bank of finer-grained soft prompt tokens for multi-task learning via a memory network. Tokens are retrieved from the bank against an input example and assembled into an instance-dependent prompt. Extensive experiments on 14 datasets demonstrate that models enhanced with TPT substantially outperform fully fine-tuned models and achieve state-of-the-art results while tuning only 0.035% of the parameters.
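A minimal sketch of an instance-dependent prompt assembled by attention over a shared token bank. The bank size, prompt length, and the additive conditioning of the retrieval queries are illustrative assumptions, not TPT's exact memory-network design:

```python
import torch
import torch.nn as nn

class TokenPromptBank(nn.Module):
    """Soft prompt tokens retrieved by attention from a shared bank and
    assembled into an instance-dependent prompt."""
    def __init__(self, bank_size=100, prompt_len=20, dim=768):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(bank_size, dim) * 0.02)
        self.queries = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, pooled_input):                               # (batch, dim)
        # Condition each prompt-slot query on the pooled input representation.
        q = self.queries.unsqueeze(0) + pooled_input.unsqueeze(1)  # (b, L, d)
        attn = torch.softmax(q @ self.bank.T / self.bank.shape[-1] ** 0.5, dim=-1)
        return attn @ self.bank                                    # (b, L, d)

prompt = TokenPromptBank()(torch.randn(4, 768))  # prepend to the frozen PLM's inputs
```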
In recent years, globalization has highlighted the need for machines that can provide truly customized communication across languages, yet the majority of research in the field focuses on widely used languages such as English. In this study, we apply HMM-based speech synthesis (HTS) to the Indonesian language. The proposed hybrid HTS-based framework, PFHTS-IDSS, uses phonemes and full-context labels to synthesize Indonesian with higher accuracy. First, we identify a list of Indonesian phonemes following the initial-final structure of the Chinese language, and add zero-initials that match Indonesian acoustics and HTS, which makes the synthesized speech natural and smooth. Second, we use Indonesian phonemes as synthesis units and synthesize speech through triphone and full-context labels. In addition, we design the context attributes of the full-context labels and the corresponding question set used to train the acoustic model, which eliminates machine-like sounds. Experimental results suggest that phoneme segmentation accuracy (PSA) and speech synthesis naturalness (SSN) are significantly improved by PFHTS-IDSS. In particular, PSA with phonemes as synthesis units reaches 88.3%, and the corresponding SSN based on full-context labels is 4.1. The results presented in this paper may be applied in multilingual interactive systems to promote better communication in voice navigation, intelligent speakers, and question-answering systems.
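For illustration, a minimal sketch of how full-context labels can be derived from a phoneme sequence. It keeps only the standard HTS quinphone core (p1^p2-p3+p4=p5); the paper's labels additionally encode the context attributes answered by the question set:

```python
def full_context_labels(phones):
    """Simplified HTS-style labels keeping only the quinphone core
    p1^p2-p3+p4=p5; 'x' pads the utterance boundaries."""
    padded = ["x", "x"] + phones + ["x", "x"]
    return [f"{a}^{b}-{c}+{d}={e}"
            for a, b, c, d, e in zip(padded, padded[1:], padded[2:],
                                     padded[3:], padded[4:])]

print(full_context_labels(["s", "a", "j", "a"]))  # toy Indonesian word "saya"
# ['x^x-s+a=j', 'x^s-a+j=a', 's^a-j+a=x', 'a^j-a+x=x']
```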
When new streaming data arrive, traditional hashing methods must retrain the hash functions on all samples, leading to high training time complexity. In contrast, online hashing algorithms recompute the hash functions based only on the newly arrived streaming data and have been widely applied in large-scale image retrieval. However, the newly arrived and old data differ in sample counts and label distributions, which causes a data imbalance problem when their similarity matrix is constructed. This paper proposes a novel supervised online hashing method, Label projection based on Hadamard codes for Online Hashing (LHOH), which jointly employs label projection and a similarity-preservation mechanism to solve this imbalance. LHOH uses Hadamard codes as the target domain of the label projection, avoiding the difficult discrete optimization of the objective function. It then uses the label projection matrix as label weights, which resolves the data imbalance when computing the similarity matrix between new and old data and preserves the consistency between Hamming-space and semantic-space similarity. To increase the distinguishability of the hash codes, LHOH designs a triple supervision mechanism combining Hadamard code assignment, label projection, and label embedding. To validate LHOH, we conduct approximate nearest neighbor (ANN) search experiments on two widely used datasets; the results show that LHOH outperforms six state-of-the-art online hashing methods.
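A minimal sketch of the Hadamard-code assignment that underlies the label projection, using scipy; the projection learning and similarity-preservation objectives themselves are omitted:

```python
import numpy as np
from scipy.linalg import hadamard

def assign_hadamard_codes(num_classes, code_len):
    """Give each class label a distinct row of a Hadamard matrix as its target
    code. Rows of a Hadamard matrix are mutually orthogonal, keeping the class
    codes maximally separated in Hamming space. code_len must be a power of
    two with code_len >= num_classes."""
    H = hadamard(code_len)   # entries in {+1, -1}
    return H[:num_classes]

codes = assign_hadamard_codes(num_classes=10, code_len=16)
# Label projection target: a one-hot label matrix Y (n x c) maps to Y @ codes.
Y = np.eye(10)[np.random.randint(0, 10, size=5)]
targets = Y @ codes          # (5, 16) per-sample target codes
```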