Large-scale distributed systems such as cloud computing applications are becoming very common. These applications bring increasing challenges regarding how to transfer data and where to store and compute it. The most prevalent distributed file system for dealing with these challenges is the Hadoop Distributed File System (HDFS), a variant of the Google File System (GFS). However, HDFS has two potential problems. The first is that it depends on a single name node to manage almost all operations on every data block in the file system. As a result, the name node can become a bottleneck resource and a single point of failure. The second potential problem is that HDFS depends on TCP to transfer data. As many studies have noted, TCP takes many round trips before it can send at the full capacity of the links in the cloud, which results in low link utilization and longer download times. To overcome these problems of HDFS, we present a new distributed file system. Our scheme uses a lightweight front-end server to direct all requests to many name nodes, which distributes the load of a single name node across many name nodes. Our second contribution is an efficient protocol for sending and routing data that can achieve full link utilization and hence decreased download times. Based on simulation, our protocol can outperform HDFS and hence GFS.
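As a rough illustration of the front-end idea described above, the sketch below partitions metadata requests across several name nodes by hashing the file path. The class names, the hash-based partitioning policy, and the per-name-node block maps are assumptions made for illustration, not the paper's actual design.

```python
import hashlib


class NameNode:
    """Holds block metadata for the subset of the namespace assigned to it."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.block_map = {}  # file path -> list of (block_id, data_node) entries

    def lookup(self, path):
        return self.block_map.get(path, [])

    def register(self, path, blocks):
        self.block_map[path] = blocks


class FrontEnd:
    """Lightweight front end that routes each request to one of many name nodes.

    Requests are partitioned here by a hash of the file path; a real system
    could use any other partitioning or load-balancing policy.
    """

    def __init__(self, name_nodes):
        self.name_nodes = name_nodes

    def _pick(self, path):
        digest = hashlib.sha1(path.encode()).hexdigest()
        return self.name_nodes[int(digest, 16) % len(self.name_nodes)]

    def lookup(self, path):
        return self._pick(path).lookup(path)

    def register(self, path, blocks):
        self._pick(path).register(path, blocks)


if __name__ == "__main__":
    cluster = FrontEnd([NameNode(i) for i in range(4)])
    cluster.register("/logs/app.log", [("blk_1", "datanode-7"), ("blk_2", "datanode-3")])
    print(cluster.lookup("/logs/app.log"))
```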
There is a need for improved tools to deal effectively with large volumes of multimedia data. In particular, real-time data processing is one of the major problems for multimedia data computing in remote sensing systems. Such big data systems have to offer effective management and computational efficiency for real-time applications. In this paper, we propose a large-scale geospatial processing method for aerial Light Detection and Ranging (LiDAR) point clouds containing multimedia data that ensures mobility and timeliness. By utilizing Spark and Cassandra, our proposed approach can significantly reduce the execution time of this time-consuming process. We investigate fast ground-only raster generation from huge LiDAR datasets. We observed that filtered cloud data produced without considering neighboring zones can lead to classification errors at the boundaries. Therefore, an integrated approach is proposed to correct these errors in order to improve classification consistency, achieve faster processing times, provide automatic error correction, obtain Digital Terrain Models (DTMs), and minimize user intervention. These features can provide a framework for on-demand DTM output and scalable application services. Furthermore, the proposed approach can be expected to benefit other real-time applications in LiDAR systems.
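To make the ground-only raster idea concrete, here is a minimal Spark sketch that bins points into a regular grid and keeps the lowest elevation per cell as a crude ground/DTM proxy. The cell size, the in-memory point list, and the schema are assumptions; the paper's actual pipeline reads the point cloud from Cassandra and applies proper ground filtering.

```python
from pyspark.sql import SparkSession

# Assumed cell size of the output raster, in metres.
CELL = 1.0

spark = SparkSession.builder.appName("ground-raster-sketch").getOrCreate()

# (x, y, z) points; in practice these would come from the LiDAR tables in
# Cassandra rather than a hard-coded list.
points = spark.sparkContext.parallelize([
    (10.2, 4.7, 101.3), (10.8, 4.1, 100.9), (25.5, 30.2, 140.2),
])

raster = (points
          .map(lambda p: ((int(p[0] // CELL), int(p[1] // CELL)), p[2]))
          .reduceByKey(min)          # lowest return per cell approximates ground
          .collect())

for (col, row), z in raster:
    print(f"cell ({col}, {row}) -> ground elevation {z:.2f}")

spark.stop()
```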
Satellite image processing is a multidomain task that involves image capture, denoising, segmentation, feature extraction, feature reduction, classification, and post-processing. A wide variety of satellite image processing models have been proposed by researchers, and each of them has different data and process requirements. For instance, the image capture module might provide images in layered form, while the feature extraction module might require data in 2D or 3D form. Moreover, the performance of these models also varies with changes in internal process parameters and dataset parameters, which limits their accuracy and scalability when applied to real-time scenarios. To reduce these limitations, a novel high-efficiency temporal engine for real-time satellite image classification using augmented incremental transfer learning is proposed and discussed in this text. The model initially captures real-time satellite data using Google's Earth Engine and processes it using a transfer learning-based convolutional neural network (CNN) via backscatter coefficient analysis. These coefficients indicate the average intensity value of a Precision Image (PRI) when evaluated over a distributed target. By extracting backscattering coefficients, the model can represent crop images in VV (vertical transmit, vertical receive) and VH (vertical transmit, horizontal receive) modes, which assists the CNN model in extracting a wide variety of features from the input satellite image and classifying these datasets (original, VV, and VH) into different crop categories. The classified images are further processed by an incremental learning layer, which assists in visual identification of affected regions. Owing to the use of incremental learning and a CNN for classification, the proposed TRSAITL model achieves an average accuracy of 97.8% for crop type and damage severity detection, with an average PSNR of 29.6 dB for different image types. The model was tested on different regions around our local geographical area, and consistent performance was observed. This performance was also compared with various state-of-the-art approaches, and the proposed TRSAITL model showed 5% better accuracy, 4.6% better precision, and 7.9% better recall, which makes it highly useful for real-time satellite-based crop classification applications.
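For readers unfamiliar with the transfer-learning step, the sketch below shows the general pattern: reuse a pretrained backbone and replace its head with a crop-class predictor. The class count, input size, and the idea of stacking original/VV/VH composites as channels are assumptions for illustration, not details taken from the TRSAITL paper.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CROP_CLASSES = 5  # assumed number of crop / damage categories

# Pretrained backbone with frozen features and a new classification head.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CROP_CLASSES)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One dummy training step on random "satellite" patches (3 channels, 224x224),
# e.g. original / VV / VH views stacked as channels.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CROP_CLASSES, (8,))

logits = backbone(images)
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
print(f"dummy step loss: {loss.item():.3f}")
```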
Contact center agents typically respond to email queries from customers by selecting predefined answer templates that relate to the questions present in the customer query. In this paper, we present a technique to automatically select the answer templates corresponding to a customer query email. Given a set of query-response email pairs, we find the associations between the actual questions and answers within them and use this information to map future questions to their answer templates. We evaluate the system on a small subset of the publicly available Pine-Info discussion list email archive and also on actual contact center data comprising customer queries, agent responses, and templates.
This paper argues that Dynamic Coalition Peer-to-Peer (P2P) Networks exist in numerous scenarios where mobile users cluster and form coalitions, and that the relationship between the sizes of coalitions and the distances from mobile nodes to their Point of Interest (PoI) follows an exponential distribution. The P2P coalition patterns of mobile users and their exponential distribution behavior can be utilized for efficient and adaptive content file download by cellular users. An adaptive protocol named COADA (COalition-aware Adaptive content DownloAd) is designed that (a) blends cellular and P2P (e.g., WiFi or Bluetooth) wireless interfaces, (b) leverages the clustering of people into P2P coalitions when moving towards a PoI, and (c) utilizes the exponential coalition-size function of the Dynamic Coalition P2P Network to minimize cellular download and meet the content file download deadline. With the COADA protocol, mobile nodes periodically sample the current P2P coalition size and predict the future coalition size using the exponential function. To decide how much file data is available over P2P coalition channels versus how much must be downloaded from the server over the cellular network, Online Codes techniques are used, and cellular download timers are tuned to meet the file download deadline. The simulation results show that COADA achieves considerable performance improvements by downloading less file data from the cellular channel and more file data over the P2P coalition network while meeting the file download deadline.
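The split decision COADA has to make can be sketched as follows: predict the expected P2P supply up to the deadline from an exponential coalition-size model, then schedule only the remaining blocks on the cellular link. The constants and the simple linear "blocks per peer" yield model below are assumptions for illustration, not COADA's actual parameters.

```python
import math

def predict_coalition_size(n0, growth_rate, t):
    """Exponential coalition-size model: n(t) = n0 * exp(growth_rate * t)."""
    return n0 * math.exp(growth_rate * t)


def blocks_from_cellular(total_blocks, n0, growth_rate, deadline,
                         blocks_per_peer_per_sec=0.2, step=10.0):
    """Estimate expected P2P supply until the deadline; the rest goes cellular."""
    expected_p2p = 0.0
    t = 0.0
    while t < deadline:
        expected_p2p += (predict_coalition_size(n0, growth_rate, t)
                         * blocks_per_peer_per_sec * step)
        t += step
    return max(0, math.ceil(total_blocks - expected_p2p))


if __name__ == "__main__":
    # 1000-block file, 3 peers now, coalition growing ~1%/s, 5-minute deadline.
    need = blocks_from_cellular(1000, n0=3, growth_rate=0.01, deadline=300)
    print(f"schedule {need} blocks on the cellular channel")
```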
Institutions worldwide, whether economic, social, or political, are relying increasingly on communication technology to perform a variety of functions: holding remote business meetings, discussing design issues in product development, enabling consumers to remain connected with their families and children, and so on. In this environment, where geographic and temporal boundaries are shrinking rapidly, electronic communication media play an important role. With recent advances in 3D sensing, computing on new hardware platforms, high-bandwidth communication connectivity, and 3D display technology, the vision of 3D video-teleconferencing and of a tele-immersive experience has become very attractive. These advances lead to tele-immersive communication systems that enable a 3D interactive experience in a virtual space consisting of objects born in physical and virtual environments. This experience is achieved by fusing real-time color-plus-depth video of physical scenes from multiple stereo cameras located at different geographic sites, displaying 3D reconstructions of physical and virtual objects, and performing computations to facilitate interactions between objects. While tele-immersive (TI) systems have been attracting a lot of attention, the advantages of the enabled interactions and of delivering 3D content for viewing, as opposed to current 2D high-definition video, have not been evaluated. In this paper, we study the effectiveness of three different types of communication media on remote collaboration in order to document the pros and cons of new technologies such as TI. The three communication media are 3D tele-immersive video, 2D Skype video, and face-to-face, used in a collaborative environment of a remote product development scenario. Through a study of 90 subjects, we discuss the strengths and weaknesses of the different media and propose scope for improvement in each of them.
Across the globe, people are using mobile devices due to the swift evolution of Internet facilities, which opened the doors for the utilization of many online platforms, mainly for social connectivity among people and for sharing their thoughts, activities, and ideas on worldwide issues. Many consumers have become accustomed to looking for reviews and ratings of a product on well-known websites before buying that product. It has become a common norm to analyze a purchase from the perspective of believability and security. If we can obtain huge amounts of data related to products, then by analyzing the obtained information we can recognize the necessary details of the products. In such a scenario, sentiment analysis comes in handy for recognizing or understanding the thoughts and reactions of consumers. Sometimes, sentiment analysis is also referred to as opinion exploration or opinion mining. SVD (singular value decomposition) and PCA (principal component analysis) are utilized in this chapter for the identification of sentiments or thoughts from text mining in order to improve accuracy and decrease implementation time. The major contributions of this analysis are as follows: the first is the preprocessing of the collected information using a proposed algorithm to create a proper basis for sentiment classification, the second is the addition of supplementary attributes to improve classification accuracy, the third is applying SVD and PCA to the obtained data, and the last is to identify and compare the performance results by designing five segments based on various attributes with or without lemmatization. The test results demonstrate that the proposed approach is more reliable than other approaches and that its implementation time can be reduced.
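A minimal example of the SVD-based dimensionality reduction step is sketched below: TF-IDF features are reduced with truncated SVD (the sparse-friendly analogue of PCA) before a simple sentiment classifier. The toy reviews, component count, and classifier choice are assumptions for illustration, not the chapter's exact preprocessing or attribute set.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus with positive (1) and negative (0) labels.
reviews = [
    "great product, totally worth the price",
    "battery life is excellent and delivery was fast",
    "terrible quality, broke after two days",
    "waste of money, very disappointed",
]
labels = [1, 1, 0, 0]

model = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2, random_state=0),  # dimensionality reduction
    LogisticRegression(),
)
model.fit(reviews, labels)
print(model.predict(["awful product, very poor quality"]))
```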
Tele-immersive 3D multi-camera environments are beginning to emerge, and they bring with them many applications as well as challenging research problems. They enable cooperative interaction between geographically distributed sites. One of the important questions that arises in the design and implementation
With the advent of virtual spaces, there has been a need to integrate the physical world with virtual spaces. This integration can be achieved by real-time 3D imaging using stereo cameras, followed by fusion of virtual and physical space information. Systems that enable such information fusion across several geographically distributed locations are called tele-immersive and should be easy to deploy. The optimal placement of 3D cameras becomes the key to obtaining high-quality 3D information about physical spaces. In this paper, we present an optimization framework for automating the placement of multiple stereo cameras in an application-specific manner. The framework eliminates ad hoc experimentation and sub-optimal camera placements for end applications by running our simulation code. The camera placement problem is formulated as an optimization problem over continuous physical space, with an objective function based on 3D information error and a set of constraints that generalize application-specific requirements. The novelty of our work lies in developing a theoretical optimization framework under spatially varying resolution requirements and in demonstrating improved camera placements with our framework in comparison with other placement techniques.
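A toy version of this formulation is sketched below: camera positions are chosen by minimizing a distance-based proxy for 3D reconstruction error over weighted points of interest, subject to staying inside the room. The error proxy, room size, weights, and points of interest are all assumptions for illustration; the paper's objective models stereo-camera 3D error and application-specific constraints.

```python
import numpy as np
from scipy.optimize import minimize

ROOM = 5.0                                               # room is [0, 5] x [0, 5] metres
POIS = np.array([[1.0, 1.0], [4.0, 2.0], [2.5, 4.0]])    # regions to cover
WEIGHTS = np.array([1.0, 2.0, 1.0])                      # spatially varying importance


def objective(x):
    """Weighted sum of squared nearest-camera distances for two cameras."""
    cams = x.reshape(2, 2)
    dists = np.linalg.norm(POIS[:, None, :] - cams[None, :, :], axis=2)
    return float(np.sum(WEIGHTS * dists.min(axis=1) ** 2))


x0 = np.array([0.5, 0.5, 4.5, 4.5])      # initial guess for two camera positions
bounds = [(0.0, ROOM)] * 4               # cameras must stay inside the room
result = minimize(objective, x0, bounds=bounds, method="L-BFGS-B")
print("camera positions:", result.x.reshape(2, 2))
```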