Monodisperse spherical silver particles with diameters of 1-3 μm were successfully prepared by chemical reduction, using silver nitrate as the precursor and ascorbic acid as the reducing agent. The effects of surfactants on the morphology and size distribution were also systematically investigated; silver particles with high sphericity can be easily obtained simply by exploiting the wrapping mechanism. A thick conductive film on a silicon substrate, with a sheet resistance of 0.15 mΩ/cm² and an adhesion of 16.74 N, was obtained by mixing the silver particles with an organic carrier and commercial glass powder, followed by sintering at 850°C. It is reasonable to deduce that the morphology and particle size of the silver particles play an important role in the performance of the conductive silver paste.
Segmenting 4K or 6K ultra high-resolution images requires extra computation considerations. Common strategies, such as downsampling, patch cropping, and cascaded models, cannot well balance accuracy and computational cost. Motivated by the fact that humans distinguish objects continuously from coarse to precise levels, we propose the Continuous Refinement Model (CRM) for the ultra high-resolution segmentation refinement task. CRM continuously aligns the feature map with the refinement target and aggregates features to reconstruct image details. Moreover, CRM shows significant generalization ability in filling the resolution gap between low-resolution training images and ultra high-resolution testing ones. We present quantitative performance evaluations and visualizations to show that our proposed method is fast and effective for image segmentation refinement. Code is available at https://github.com/dvlab-research/Entity/tree/main/CRM.
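To make the idea of continuous alignment concrete, below is a minimal PyTorch sketch of an implicit-style refinement head that queries a low-resolution feature map and coarse mask at continuous (sub-pixel) coordinates of the target resolution. All class and parameter names here are hypothetical illustrations under assumed shapes, not the released CRM code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousRefiner(nn.Module):
    """Hypothetical sketch: refine a coarse mask by querying an MLP at
    continuous sub-pixel coordinates, in the spirit of implicit models."""
    def __init__(self, feat_dim=256, hidden=256):
        super().__init__()
        # +1 input channel for the coarse mask value sampled at each point
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats, coarse_mask, coords):
        # feats: (B, C, h, w) low-res features; coarse_mask: (B, 1, h, w)
        # coords: (B, N, 2) query points in [-1, 1] at the target resolution
        grid = coords.unsqueeze(2)                                 # (B, N, 1, 2)
        f = F.grid_sample(feats, grid, align_corners=False)        # (B, C, N, 1)
        m = F.grid_sample(coarse_mask, grid, align_corners=False)  # (B, 1, N, 1)
        x = torch.cat([f, m], dim=1).squeeze(-1).transpose(1, 2)   # (B, N, C+1)
        return torch.sigmoid(self.mlp(x))  # refined per-point mask probabilities
```

Because the query coordinates are continuous rather than tied to a pixel grid, the same head can be evaluated at any output resolution, which is one plausible way to bridge low-resolution training and ultra high-resolution testing.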
With the rapid expansion of digital music formats, recommending users their favorite music has become indispensable. In music recommendation, users' personality and emotion greatly affect their music preference, in a long-term and a short-term manner respectively, while rich social media data provides effective signals for both. In this paper, aiming at music recommendation on social media platforms, we propose a Personality and Emotion Integrated Attentive model (PEIA), which fully utilizes social media data to comprehensively model users' long-term taste (personality) and short-term preference (emotion). Specifically, it takes full advantage of personality-oriented user features, emotion-oriented user features, and music features with multi-faceted attributes. Hierarchical attention is employed to distinguish the important factors when incorporating the latent representations of users' personality and emotion. Extensive experiments on a large real-world dataset of 171,254 users demonstrate the effectiveness of our PEIA model, which achieves an NDCG of 0.5369, outperforming state-of-the-art methods. We also perform detailed parameter analysis and feature contribution analysis, which further verify our scheme and demonstrate the significance of co-modeling user personality and emotion in music recommendation.
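As an illustration of how attention can weight long-term (personality) against short-term (emotion) signals when fusing them with item features, here is a minimal PyTorch sketch. The module, dimensions, and scoring network are assumptions for exposition, not the authors' PEIA implementation.

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    """Hypothetical sketch: attention-weighted fusion of user personality,
    user emotion, and music representations before scoring."""
    def __init__(self, dim=128):
        super().__init__()
        # small scoring network that assigns each factor an importance weight
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, personality, emotion, music):
        # each input: (B, dim) latent representation of one factor
        stacked = torch.stack([personality, emotion, music], dim=1)  # (B, 3, dim)
        weights = torch.softmax(self.score(stacked), dim=1)          # (B, 3, 1)
        return (weights * stacked).sum(dim=1)                        # fused (B, dim)
```

A hierarchical variant, as the abstract describes, would apply such attention first within each factor's own features and then again across the factor-level representations.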
Dense image segmentation tasks (e.g., semantic, panoptic) are useful for image editing, but existing methods can hardly generalize well in an in-the-wild setting with unrestricted image domains, classes, and variations in image resolution and quality. Motivated by these observations, we construct a new entity segmentation dataset with a strong focus on high-quality dense segmentation in the wild. The dataset contains images spanning diverse image domains and entities, along with plentiful high-resolution images and high-quality mask annotations for training and testing. Given the high-quality and high-resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images. It improves mask prediction by fusing the full image with high-resolution image crops that provide more fine-grained image details. CropFormer is the first query-based Transformer architecture that can effectively fuse mask predictions from multiple image views, by learning queries that effectively associate the same entities across the full image and its crops. With CropFormer, we achieve a significant AP gain of 1.9 on the challenging entity segmentation task. Furthermore, CropFormer consistently improves the accuracy of traditional segmentation tasks and datasets. The dataset and code are released at http://luqi.info/entityv2.github.io/.
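To illustrate the general pattern of query-based fusion across views, the following is a minimal PyTorch sketch in which the same learned entity queries decode masks on both the full image and a crop; because query i is shared across views, its two predictions can be associated and merged. Every name and detail here is a hypothetical simplification, not the released CropFormer code.

```python
import torch
import torch.nn as nn

class QueryFusionHead(nn.Module):
    """Hypothetical sketch: shared entity queries attend to features of the
    full image and of a high-res crop; matching query indices link the same
    entity across the two views so their masks can later be merged."""
    def __init__(self, num_queries=100, dim=256):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def decode(self, feats):
        # feats: (B, HW, dim) flattened image features from one view
        q = self.queries.weight.unsqueeze(0).expand(feats.size(0), -1, -1)
        q, _ = self.attn(q, feats, feats)               # (B, Q, dim)
        return torch.einsum('bqd,bnd->bqn', q, feats)   # per-query mask logits

    def forward(self, full_feats, crop_feats):
        # Query i on the full image and query i on the crop are assumed to
        # describe the same entity; in a real system the crop's logits would
        # be pasted back into image coordinates and averaged with the full view.
        return self.decode(full_feats), self.decode(crop_feats)
```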
The convolution operation suffers from a limited receptive field, while global modeling is fundamental to dense prediction tasks such as semantic segmentation. In this paper, we apply graph convolution to the semantic segmentation task and propose an improved Laplacian. Graph reasoning is performed directly in the original feature space, organized as a spatial pyramid. Different from existing methods, our Laplacian is data-dependent, and we introduce an attention diagonal matrix to learn a better distance metric. It dispenses with the projection and re-projection processes, making our proposed method a lightweight module that can be easily plugged into current computer vision architectures. More importantly, performing graph reasoning directly in the feature space retains spatial relationships and enables the spatial pyramid to explore multiple long-range contextual patterns at different scales. Experiments on Cityscapes, COCO Stuff, PASCAL Context, and PASCAL VOC demonstrate the effectiveness of our proposed method on semantic segmentation. We achieve comparable performance with lower computational and memory overhead.
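The following is a minimal PyTorch sketch of the general idea: pixels act as graph nodes, the affinity is computed from the data itself, and a learned diagonal vector reweights channels as a simple distance metric, with no projection into a separate node space. The module name, the single propagation step, and the residual wiring are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DataDependentGraphReasoning(nn.Module):
    """Hypothetical sketch: one step of graph reasoning directly in the
    spatial feature space, with a data-dependent affinity and a learned
    diagonal metric over channels."""
    def __init__(self, channels=256):
        super().__init__()
        self.metric = nn.Parameter(torch.ones(channels))   # diagonal attention
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        B, C, H, W = x.shape
        nodes = x.flatten(2).transpose(1, 2)               # (B, HW, C): pixels as nodes
        weighted = nodes * self.metric                     # channel-wise distance metric
        affinity = torch.softmax(
            weighted @ nodes.transpose(1, 2), dim=-1)      # (B, HW, HW) data-dependent
        out = affinity @ nodes                             # propagate along the graph
        out = out.transpose(1, 2).reshape(B, C, H, W)      # back to the spatial layout
        return x + self.proj(out)                          # residual connection
```

Applying such a module at several pooled resolutions would form the spatial pyramid the abstract mentions, each level capturing long-range context at a different scale.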
Advances in text-based image generation and editing have revolutionized content creation, enabling users to create impressive content from imaginative text prompts. However, existing methods are not designed to work well with the oversimplified prompts often encountered in typical scenarios, where users start editing with only vague or abstract purposes in mind. Those scenarios demand elaborate ideation efforts from the users to bridge the gap between such vague starting points and the detailed creative ideas needed to depict the desired results. In this paper, we introduce the task of Image Editing Recommendation (IER). This task aims to automatically generate diverse creative editing instructions from an input image and a simple prompt representing the users' under-specified editing purpose. To this end, we introduce the Creativity-Vision Language Assistant (Creativity-VLA), a multimodal framework designed specifically for edit-instruction generation. We train Creativity-VLA on our edit-instruction dataset specifically curated for IER. We further enhance our model with a novel 'token-for-localization' mechanism, enabling it to support both global and local editing operations. Our experimental results demonstrate the effectiveness of Creativity-VLA in suggesting instructions that not only contain engaging creative elements but also maintain high relevance to both the input image and the user's initial hint.
Recent studies have shown that miR-802 is abnormally expressed in many tumors. miR-802 is expressed at low levels in the tissues and cells of gastric cancer, colorectal cancer, breast cancer, cervical cancer, epithelial ovarian cancer, tongue squamous cell carcinoma, oral squamous cell carcinoma, esophageal squamous cell carcinoma, laryngeal squamous cell carcinoma, and melanoma. In contrast, miR-802 is overexpressed in hepatocellular carcinoma, bladder urothelial cancer, osteosarcoma, and cholesteatoma tissues and cells. It should be noted that the results of studies on the expression of miR-802 in pancreatic cancer, prostate cancer, and lung cancer are inconsistent. Current studies have found that miR-802 can target and regulate genes in different tumors, affecting the Wnt, EMT, PI3K/AKT, ERK, and Hedgehog signaling pathways. At the same time, miR-802 is regulated through endogenous competition by four ceRNAs: circDONSON, IGFL2-AS1, MIR155HG, and MIR4435-2HG. This article reviews the abnormal expression of miR-802 in a variety of tumors, expounds the mechanisms by which miR-802 affects tumor progression by regulating different target genes, and elaborates the network of miR-802-related ceRNAs. We also summarize the limitations of miR-802 research and look forward to the potential application of miR-802 in tumor diagnosis and prognosis.
The correspondence between residual networks and dynamical systems motivates researchers to unravel the physics of ResNets with well-developed tools from the numerical methods of ODE systems. The Runge-Kutta-Fehlberg method is an adaptive time-stepping scheme that offers a good trade-off between stability and efficiency. Can we also have adaptive time stepping for ResNets to ensure both stability and performance? In this study, we analyze the effects of time stepping on the Euler method and ResNets. We establish a stability condition for ResNets in terms of step sizes and weight parameters, and point out the effects of step sizes on stability and performance. Inspired by our analyses, we develop an adaptive time-stepping controller that is dependent on the parameters of the current step and aware of previous steps. The controller is jointly optimized with network training so that variable step sizes and evolution time can be adaptively adjusted. We conduct experiments on ImageNet and CIFAR to demonstrate the effectiveness of our approach. It is shown that our proposed method improves both stability and accuracy without introducing additional overhead in the inference phase.
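The ResNet/ODE correspondence views a residual block as an explicit Euler step, x_{t+1} = x_t + h_t f(x_t), where a plain ResNet fixes h_t = 1. The following is a minimal PyTorch sketch of a block whose step size is predicted from the current activations; the tiny controller shown here is a hypothetical stand-in, not the authors' history-aware controller.

```python
import torch
import torch.nn as nn

class AdaptiveStepResBlock(nn.Module):
    """Hypothetical sketch of the ResNet/Euler correspondence
    x_{t+1} = x_t + h_t * f(x_t), with h_t predicted rather than fixed."""
    def __init__(self, channels=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # tiny controller: pooled features -> positive scalar step size
        self.step = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Softplus(),
        )

    def forward(self, x):
        h = self.step(x).view(-1, 1, 1, 1)  # per-sample adaptive step size h_t
        return x + h * self.f(x)            # explicit Euler update
```

Because h_t multiplies the residual branch, bounding it (e.g., via the stability condition the paper derives) directly constrains how far each layer can push the hidden state per step.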
Neural Radiance Fields (NeRF) have been widely applied to various tasks for their high-quality representation of 3D scenes, but they require long per-scene training time and per-image testing time. In this paper, we present EfficientNeRF, an efficient NeRF-based method to represent 3D scenes and synthesize novel-view images. Although several ways exist to accelerate the training or the testing process, it is still difficult to substantially reduce the time of both phases simultaneously. We analyze the density and weight distribution of the sampled points and then propose valid and pivotal sampling at the coarse and fine stages, respectively, to significantly improve sampling efficiency. In addition, we design a novel data structure to cache the whole scene during testing to accelerate rendering speed. Overall, our method can reduce training time by over 88% and reach a rendering speed of over 200 FPS, while still achieving competitive accuracy. Experiments show that our method promotes the practicality of NeRF in the real world and enables many applications. The code is available at https://github.com/dvlab-research/EfficientNeRF.
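To illustrate the two sampling ideas on a single ray, here is a minimal PyTorch sketch: coarse samples with negligible density are discarded ("valid" sampling), and fine samples are concentrated around the depth with the largest volume-rendering weight ("pivotal" sampling). The thresholds, the single pivot, and the function name are simplifying assumptions, not the paper's actual procedure.

```python
import torch

def valid_and_pivotal_sampling(densities, ts, num_fine=64, eps=1e-4):
    """Hypothetical sketch: keep only non-empty coarse samples, then place
    fine samples around the most influential coarse depth along one ray."""
    # densities: (N,) coarse densities along one ray; ts: (N,) sample depths
    valid = densities > eps                          # skip empty space
    deltas = torch.diff(ts, append=ts[-1:] + (ts[-1] - ts[-2]))
    alpha = 1.0 - torch.exp(-densities * deltas)     # standard volume rendering
    trans = torch.cumprod(
        torch.cat([alpha.new_ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = alpha * trans                          # contribution of each sample
    pivot = ts[weights.argmax()]                     # most influential depth
    half = (ts[-1] - ts[0]) / densities.numel()      # local window around the pivot
    fine_ts = torch.linspace(float(pivot - half), float(pivot + half), num_fine)
    return ts[valid], fine_ts
```

A full implementation would track several pivots per ray and pair this sampling with the cached scene structure the abstract describes, so that testing avoids re-querying the network in empty space.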