The new oneM2M (one machine-to-machine) standard aims to standardize the architecture and protocols of Internet of Things (IoT) middleware for better interoperability. Although the standard seems promising, it lacks several features for efficiently searching and retrieving IoT data that satisfy users' intentions. In this paper, we design and develop a oneM2M-based query engine, called OMQ, that provides real-time processing over IoT data streams. For this purpose, we define a query language that enables users to retrieve IoT data from data sources using JavaScript Object Notation (JSON). We also propose efficient query processing algorithms that utilize the oneM2M architecture, which consists of two kinds of nodes: (1) the IoT node and (2) the infrastructure node. The IoT nodes of OMQ are mainly sensor devices that execute the aggregate, transform, and filter operators of user queries, whereas the infrastructure node handles the join operator. Because the query processing algorithms are implemented as hybrid infrastructure-edge processing, user queries can be executed efficiently in each IoT node rather than only in the infrastructure node. Thus, our OMQ system reduces the query processing time and the network bandwidth usage. We conducted a comprehensive evaluation of OMQ using a real and a synthetic data set. Experimental results demonstrate the feasibility and efficiency of the OMQ system for executing queries and transferring data from each IoT node.
Indoor positioning techniques based on received signal strength indicator (RSSI) sensors can provide useful trajectory-based services, including user movement analytics, next-to-visit recommendation, and hotspot detection. However, RSSI values are often disturbed by obstacles in indoor environments, such as doors, walls, and furniture. As a result, many indoor positioning techniques still extract invalid trajectories from the disturbed RSSI. An invalid trajectory contains distant or impossible consecutive positions within a short time, which is unlikely in a real-world scenario. In this study, we enhanced indoor positioning techniques with movement constraints on Bluetooth Low Energy (BLE) RSSI data to prevent invalid semantic indoor trajectories. The movement constraints ensure that a predicted semantic position cannot be far from the previous position. Furthermore, these movement constraints can be used to extend any indoor positioning technique. We conducted comprehensive experimental studies on real BLE RSSI datasets from various indoor environment scenarios. The experimental results demonstrated that the proposed approach effectively extracts valid indoor semantic trajectories from the RSSI data.
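The movement constraint described above can be illustrated with a minimal sketch. It assumes that some positioning technique has already produced a score for every semantic position at each time step, and that a map of directly reachable positions is available. The function name, the floor plan, and the scores are hypothetical, and the greedy decoding shown here is only one possible way to apply the constraint.

```python
from typing import Dict, List, Set


def constrained_trajectory(
    scores: List[Dict[str, float]],
    reachable: Dict[str, Set[str]],
) -> List[str]:
    """Greedy decoding that only allows moves to positions reachable from the previous one."""
    trajectory: List[str] = []
    for step_scores in scores:
        if trajectory:
            prev = trajectory[-1]
            allowed = reachable[prev] | {prev}  # staying in place is always allowed
            candidates = {p: s for p, s in step_scores.items() if p in allowed}
            if not candidates:
                # No allowed position was scored: keep the previous position.
                candidates = {prev: 0.0}
        else:
            candidates = step_scores  # first step has no constraint
        trajectory.append(max(candidates, key=candidates.get))
    return trajectory


# Hypothetical floor plan: lobby <-> corridor <-> lab (lobby and lab are not adjacent).
REACHABLE = {"lobby": {"corridor"}, "corridor": {"lobby", "lab"}, "lab": {"corridor"}}
# Hypothetical per-step scores; the second step is noisy and favours a jump to "lab".
SCORES = [
    {"lobby": 0.8, "corridor": 0.1, "lab": 0.1},
    {"lobby": 0.2, "corridor": 0.3, "lab": 0.5},
    {"lobby": 0.1, "corridor": 0.2, "lab": 0.7},
]

print(constrained_trajectory(SCORES, REACHABLE))  # ['lobby', 'corridor', 'lab']
```

Without the constraint, taking the top-scored position at every step would give lobby, lab, lab, which contains an impossible jump in this floor plan.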
Product recommendation plays a significant role in various industries, including e-commerce, retail, hospitality, and finance. Recommendations can increase customer satisfaction and sales by helping customers find relevant products. A collaborative filtering approach is used for product recommendation in this work because the available data focuses only on user features. This approach leverages user-product interaction data to uncover patterns and similarities among users. A graph representation is used to model the user-product interactions, which allows more comprehensive modeling of the dependencies and relationships between users and products. This study uses a graph convolutional network (GCN) in combination with a factorization machine (FM) to improve recommendation personalization. The GCN uses graph convolution to propagate and update node embeddings based on their neighborhood relationships. It exploits information from the local neighborhood and from the broader graph structure to improve the understanding of user preferences and to produce personalized recommendations. The GCN can also overcome limitations of other methods by considering more detailed relationships among products and the unique features of each product. The FM considers interactions between user features and product features, and thus captures user preferences in more depth. By integrating the strengths of GCN and FM, product recommendation is expected to deliver a more engaging and enjoyable user experience.
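For reference, the standard second-order factorization machine scores a feature vector as shown below. The abstract does not state which FM variant is used or how the GCN embeddings enter the model, so this is the textbook form rather than the exact formulation of the study.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
```

Here $\mathbf{x} \in \mathbb{R}^n$ is the concatenated user and product feature vector, $w_0$ is a global bias, $w_i$ are per-feature weights, and $\mathbf{v}_i \in \mathbb{R}^k$ is the latent vector of feature $i$, so that the inner product $\langle \mathbf{v}_i, \mathbf{v}_j \rangle$ models the pairwise interaction between features $i$ and $j$.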
The decisions made by machine learning models are mostly based on the historical data used to train them. This raises awareness that discrimination in machine learning should be eliminated, because the historical data may contain societal bias. The financial industry uses credit scoring as a reference to reflect the customer risk profile. To achieve fairness in the model, this paper (1) assesses bias and (2) improves fairness in machine learning models with three bias mitigation methodologies. This study shows that there is a trade-off between improving fairness and preserving performance. Post-processing methods, for example Grid Search, perform best.
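As an illustration of what assessing bias can look like, the sketch below computes the statistical parity difference, a commonly used group fairness metric. The abstract does not name the metrics used in the study, so the choice of metric, the function names, and the toy data are all assumptions.

```python
from typing import Sequence


def selection_rate(preds: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Fraction of members of `group` that received the favourable outcome (1)."""
    chosen = [p for p, g in zip(preds, groups) if g == group]
    return sum(chosen) / len(chosen)


def statistical_parity_difference(
    preds: Sequence[int],
    groups: Sequence[str],
    unprivileged: str,
    privileged: str,
) -> float:
    """Selection rate of the unprivileged group minus that of the privileged group.

    A value of 0 means both groups are approved at the same rate; negative values
    mean the unprivileged group is approved less often.
    """
    return selection_rate(preds, groups, unprivileged) - selection_rate(
        preds, groups, privileged
    )


# Hypothetical credit decisions (1 = approved) for two groups "a" and "b".
PREDS = [1, 1, 1, 0, 1, 0, 0, 0]
GROUPS = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(statistical_parity_difference(PREDS, GROUPS, unprivileged="b", privileged="a"))  # -0.5
```

In this toy data, group "a" is approved 75% of the time and group "b" 25% of the time, giving a difference of -0.5. A mitigation method would aim to move this value toward 0 while limiting the loss in predictive performance.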
In branchless banking, a fraudulent transaction can be defined as an act of an agent conducting non-essential transactions in order to receive the transaction fee. Such a transaction is legitimate in terms of money movement, which makes this kind of fraud hard to detect. This study aims to learn the characteristics of fraudulent transactions and to provide a rule for identifying suspected fraudulent transactions. To reduce the size of the transaction data, the Leiden community detection algorithm is used to cluster the users in a transactional graph. Each community is then statistically processed to find its outliers. From the analysis, it is estimated that 25% of agents have conducted fraudulent transactions. The characteristics of fraudulent transactions are expressed relative to the community average: fraudulent transactions are 185% above the average in terms of transaction value and 90% above the average in terms of transaction frequency. The fraudulent transactions always act as outliers in their respective communities.
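A rule of this kind could be sketched as follows. The sketch assumes community labels have already been assigned (for example by a Leiden implementation) and turns the reported averages into thresholds: 185% above the community average is 2.85 times the average, and 90% above is 1.90 times the average. Using these averages as cut-offs, requiring both conditions at once, the record layout, and the sample data are all assumptions made for illustration, not the rule derived in the study.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def flag_suspects(
    agents: List[Dict],
    value_ratio: float = 2.85,  # 185% above the community average (assumed cut-off)
    freq_ratio: float = 1.90,   # 90% above the community average (assumed cut-off)
) -> List[str]:
    """Flag agents whose value AND frequency both exceed their community averages."""
    by_community = defaultdict(list)
    for agent in agents:
        by_community[agent["community"]].append(agent)

    suspects: List[str] = []
    for members in by_community.values():
        avg_value = mean(m["value"] for m in members)
        avg_freq = mean(m["frequency"] for m in members)
        for m in members:
            if m["value"] >= value_ratio * avg_value and m["frequency"] >= freq_ratio * avg_freq:
                suspects.append(m["agent"])
    return suspects


# Hypothetical agents with precomputed community labels.
AGENTS = [
    {"agent": "A1", "community": 0, "value": 100, "frequency": 10},
    {"agent": "A2", "community": 0, "value": 100, "frequency": 10},
    {"agent": "A3", "community": 0, "value": 100, "frequency": 10},
    {"agent": "A4", "community": 0, "value": 100, "frequency": 10},
    {"agent": "A5", "community": 0, "value": 2600, "frequency": 60},
    {"agent": "B1", "community": 1, "value": 200, "frequency": 5},
    {"agent": "B2", "community": 1, "value": 200, "frequency": 5},
    {"agent": "B3", "community": 1, "value": 200, "frequency": 5},
]

print(flag_suspects(AGENTS))  # ['A5']
```

In community 0 the average value is 600 and the average frequency is 20, so only A5 exceeds both 2.85 × 600 and 1.90 × 20. Community 1 is homogeneous and yields no suspects.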
An indoor semantic trajectory is a sequence of timestamped semantic positions inside a building. However, its extraction depends on error-prone indoor positioning. Positioning errors lead to invalid trajectories that contain distant consecutive positions. Such invalid trajectories may produce nonsensical patterns when analyzing big semantic trajectory data. To prevent the extraction of invalid trajectories, we apply movement constraints so that only positions close to the current position are inferred. We extend several indoor positioning techniques with these constraints, such as Hidden Markov Model, K-Nearest Neighbor, and Deep Neural Network. We show that our approach can effectively extract valid indoor semantic trajectories.
Indoor location-based services have been widely investigated to take advantage of semantic trajectories for providing user-oriented services in indoor environments. Although indoor semantic trajectories can give users a seamless understanding of the provided location-based services, studies on the application of deep learning approaches to robust and valid semantic indoor localization are lacking. In this study, we combined a stacked denoising autoencoder and a long short-term memory network with a rule-based refinement method, a rule-based hidden Markov model (HMM), to perform robust and valid semantic trajectory extraction. In particular, our rule-based HMM approach incorporates a direct set of rules into the HMM to resolve invalid movements in the extracted semantic trajectories and is extensible to various deep learning techniques. We compared the performance of our proposed approach with that of other cutting-edge deep learning approaches on two different real-world data sets. The experimental results demonstrate the feasibility of our proposed approach for producing more robust and valid semantic trajectories.
BERT and IndoBERT have achieved impressive performance in several NLP tasks. There have been several investigations of their adaptation to specialized domains, especially for the English language. We focus on the financial domain and the Indonesian language, and perform post-training of the pre-trained IndoBERT for the financial domain using a small-scale Indonesian financial corpus. In this paper, we construct an Indonesian self-supervised financial corpus, an Indonesian financial sentiment analysis dataset, and an Indonesian financial topic classification dataset, and we release a family of BERT models for financial NLP. We also evaluate the effectiveness of domain-specific post-training on sentiment analysis and topic classification tasks. Our findings indicate that post-training increases the effectiveness of a language model when it is fine-tuned on domain-specific downstream tasks.