    Supporting the construction of mystery novel knowledge graphs using BERT
    Citations: 0 | References: 3 | Related papers: 10
    Abstract:
    In the Knowledge Graph Reasoning Challenge, the plot of a mystery novel is converted into a knowledge graph. Building the graph requires choosing which parts of the text to put into it. This research aims to automate that task. Specifically, BERT, a natural language processing model, is used to classify text as either summary or non-summary, with the results proposed as the parts to put into the knowledge graph. We built a binary classification model and evaluated its accuracy; it achieved an F-value of 0.59. As for improving the accuracy, we expect that organizing the dataset could have a significant impact. This is to be pursued as a next step.
    Keywords:
    Knowledge graph
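As a brief illustration of the F-value reported in the abstract above, the following sketch computes the F1 score (the harmonic mean of precision and recall) for a binary summary/non-summary classifier. The labels and predictions are made-up toy data, not the paper's results, and `f1_score` is a hypothetical helper name. The abstract does not say which F-measure variant was used, so F1 is an assumption here.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for the `positive` class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed): 1 = summary sentence, 0 = non-summary sentence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")  # prints "F1 = 0.667"
```

Because F1 ignores true negatives, it is a common choice when the positive class (here, summary text) is much rarer than the negative class.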
    Utilizing a dataset sourced from a higher education institution, this study aims to assess the efficacy of diverse machine learning algorithms in predicting student dropout and academic success. Our focus was on algorithms capable of effectively handling imbalanced data. To tackle class imbalance, we employed the SMOTE resampling technique. We applied a range of algorithms, including Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF), as well as boosting algorithms such as Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), CatBoost (CB), and Light Gradient Boosting Machine (LightGBM). To enhance the models' performance, we conducted hyperparameter tuning using Optuna. Additionally, we employed the Isolation Forest (IF) method to identify outliers or anomalies within the dataset. Notably, our findings indicate that boosting algorithms, particularly LightGBM and CatBoost tuned with Optuna, outperformed traditional classification methods. Our study's generalizability to other contexts is constrained by its reliance on a single dataset with inherent limitations. Nevertheless, this research provides valuable insights into the effectiveness of various machine learning algorithms for predicting student dropout and academic success. By benchmarking these algorithms, our project offers guidance to both researchers and practitioners in their choice of suitable approaches for similar predictive tasks.
    Boosting
    Gradient boosting
    Benchmarking
    Hyperparameter
    Statistical classification
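The SMOTE resampling step mentioned in the abstract above can be illustrated with a minimal pure-Python sketch of its core idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration under stated assumptions, not the study's implementation; the function name `smote_sketch`, the parameter defaults, and the toy points are all hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        n = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, n)))
    return synthetic


# Toy minority class (assumed): four 2-D feature vectors.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_sketch(minority, n_new=5)
print(len(synthetic))  # prints 5
```

In a typical workflow, resampling of this kind is applied to the training split only, so that the test set keeps the original class distribution.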
    Citations (15)
    Machine learning enables a program to analyze information, learn correlations, and use the resulting insights to solve problems, enrich information, and make predictions. The American Heart Association's 2016 statistics report shows that heart disease is the leading cause of death, responsible for 1 in every 4 deaths. Machine learning algorithms play a very important role in the medical field. We use machine learning to understand, predict, and prevent cardiovascular disease using numeric data. The end goal is to produce an approved machine learning application in healthcare. To refine the search for a useful and accurate method for the dataset, the results of several algorithms will be compared. The front-runners will be analyzed and used to develop a unique, higher-accuracy method. The machine learning methods include Logistic Regression, Naïve Bayes, and Decision Tree (CART). For better accuracy, we use ensemble learning, which includes algorithms such as Random Forest, XGBoost, and the Extra Trees classifier. Our work also adds to the existing literature by giving a far-reaching review of machine learning algorithms for disease prediction tasks. Our goal is to perform predictive analysis of heart disease with these machine learning algorithms using ensemble techniques such as bagging, boosting, and stacking, and to conclude which techniques are effective and efficient. Large medical datasets are available in various data repositories and are used in real-world applications.
    Ensemble Learning
    Boosting
    Online machine learning
    Gradient boosting
    The fact that cardiovascular disease (CVD) is a major cause of death worldwide highlights the significance of accurate prediction for successful preventative and treatment measures. Machine learning algorithms, which use the analysis of vast patient data to reveal hidden patterns and risk variables, have recently come to light as potential techniques for CVD prediction. This study intends to analyse how machine learning (ML) techniques are used in CVD prediction and evaluate how well they perform in comparison to conventional approaches. The study makes use of a sizable patient cohort's medical history, demographic data, and clinical factors in a complete dataset. Predictive models are built using a variety of machine learning algorithms, including support vector machine (SVM), gradient boosting, K-nearest neighbours, naive Bayes classifier, and logistic regression. To find the most important factors influencing CVD risk, feature selection techniques are used. Metrics like accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve are used to assess how well the machine learning models work. The outcomes are contrasted with well-known risk prediction models and clinical guidelines in order to assess the added value of machine learning methods. This work intends to improve CVD prediction capabilities and offer useful insights for better risk assessment and management strategies by utilizing the power of machine learning.
    Boosting
    Predictive modelling
    Gradient boosting
    Statistical classification
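The evaluation metrics named in the abstract above (accuracy, sensitivity, and specificity) can all be derived from the four cells of a binary confusion matrix, as the following sketch shows. The labels are made-up toy data rather than results from the study, and `binary_metrics` is a hypothetical helper name. The area under the ROC curve is omitted because it requires predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity (recall): share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy data (assumed): 1 = patient has CVD, 0 = patient does not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.3f}")
# prints accuracy: 0.700, sensitivity: 0.750, specificity: 0.667
```

Reporting sensitivity and specificity alongside accuracy matters in screening settings, where a missed case and a false alarm carry different costs.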
    Recently, the use of machine learning in meteorology has increased greatly. While many machine learning methods are not new, university classes on machine learning are largely unavailable to meteorology students and are not required to become a meteorologist. The lack of formal instruction has contributed to the perception that machine learning methods are 'black boxes', and thus end-users are hesitant to apply them in their everyday workflow. To reduce the opaqueness of machine learning methods and lower hesitancy towards machine learning in meteorology, this paper provides a survey of some of the most common machine learning methods. A familiar meteorological example is used to contextualize the methods, and machine learning topics are discussed in plain language. The following machine learning methods are demonstrated: linear regression; logistic regression; decision trees; random forest; gradient boosted decision trees; naive Bayes; and support vector machines. Beyond discussing the different methods, the paper also contains discussions on the general machine learning process as well as best practices to enable readers to apply machine learning to their own datasets. Furthermore, all code (in the form of Jupyter notebooks and Google Colaboratory notebooks) used to make the examples in the paper is provided in an effort to catalyse the use of machine learning in meteorology.
    Instance-based learning
    Online machine learning
    Citations (49)
    Machine Learning (ML) is a technology that can revolutionize the world. It is a subset of Artificial Intelligence (AI) that can predict outcomes without being explicitly programmed to do so. Through machine learning, a machine can automatically learn from data and get better at what it does: if additional data can be gathered to help a machine perform better, it can learn. This developing technology allows computers to learn from historical data and to predict outcomes. Machine learning is very important today because it makes our work easier. Many companies use machine learning in their products. Google uses it in Google Assistant, which takes our voice commands and returns what we ask for, and in Google Lens, with which we can find anything just by taking a picture. Netflix uses machine learning to recommend movies and series. Machine learning has a very deep effect on our lives; for example, we now have self-driving cars.
    Online machine learning
    Hyper-heuristic
    Instance-based learning
    Machine learning is often perceived as a sophisticated technology accessible only to highly trained experts. This prevents many physicians and biologists from using this tool in their research. The goal of this paper is to eliminate this outdated perception. We argue that the recent development of auto machine learning techniques enables biomedical researchers to quickly build competitive machine learning classifiers without requiring in-depth knowledge of the underlying algorithms. We study the case of predicting the risk of cardiovascular diseases. To support our claim, we compare auto machine learning techniques against a graduate student using several important metrics, including the total amount of time required for building machine learning models and the final classification accuracies on unseen test datasets. In particular, the graduate student manually builds multiple machine learning classifiers and tunes their parameters for one month using scikit-learn, a popular machine learning library, to obtain the ones that perform best on two given, publicly available datasets. We run an auto machine learning library called auto-sklearn on the same datasets. Our experiments find that automatic machine learning takes 1 h to produce classifiers that perform better than the ones built by the graduate student in one month. More importantly, building this classifier only requires a few lines of standard code. Our findings are expected to change the way physicians see machine learning and encourage wide adoption of Artificial Intelligence (AI) techniques in clinical domains.
    Learning classifier system
    Citations (65)
    This paper deals with the deployment and evaluation of machine learning classifiers for the prediction of tuberculosis. It deploys five key machine learning classifiers: Naive Bayes, Support Vector Machine, Decision Tree, K Nearest Neighbors, and Random Forest. The results show that Support Vector Machine provides the best accuracy, 99.3%, for the prediction of Pulmonary Tuberculosis (PTB) and Extrapulmonary Tuberculosis (EPTB) when compared with all the other machine learning classifiers on the tuberculosis data set. An important challenge in machine learning is to build accurate and competent classifiers. Hence, Support Vector Machine is the best-suited machine learning classifier for prediction of PTB and EPTB.
    Relevance vector machine