Abstract Background Free-text notes in disease intervention specialist (DIS) records may contain relevant information for STI control. In their current form, the notes are not analyzable without manual reading, which is labor-intensive and prone to error. Methods We used natural language processing (NLP) methods to analyze 2019 Ohio DIS syphilis records with non-missing notes (n = 1,987). We identified 21 topics relevant for transmission and case investigations. We manually coded these records to create “gold standard” labels for each topic (0 = topic not present, 1 = topic present), then trained machine learning models to identify the topics in the text. For models to analyze text data, the text must be converted to numbers. We explored two approaches to numerically represent words: (1) term frequency-inverse document frequency (TF-IDF), which measures the importance of words based on how many times they appear in a record and in the dataset as a whole, and (2) GloVe embeddings, which are numerical vectors that were developed by researchers for each word in the English language to encode its semantic meaning. We explored three types of statistical models (naïve Bayes, support vector machine [SVM], and logistic regression) using TF-IDF, and one type of neural network model (long short-term memory [LSTM] model) using GloVe. All models were used for binary prediction (i.e., topic not present, topic present). Results For most topics, the LSTM model performed the best overall in identifying topics, and the SVM model performed the best among the statistical models. For example, the LSTM model predicted the topic “substance use” with high accuracy (97%), sensitivity (92%), and specificity (98%). No model performed well for uncommon topics (e.g., “alcohol use” or “delays in care”). Conclusions Machine learning models performed well in identifying some topics in 2019 Ohio syphilis records. This analysis is a first step in applying NLP methods to make DIS notes more accessible for analysis.
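As a rough illustration of two ideas in this abstract, the sketch below computes TF-IDF weights for a few short notes and derives accuracy, sensitivity, and specificity from binary labels. It is a minimal sketch and not the authors' pipeline: the function names, the toy notes, and the specific TF-IDF variant (term count divided by note length, times the natural log of N over document frequency) are assumptions chosen for clarity.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Other TF-IDF variants (smoothing, normalization) are also common.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def binary_metrics(gold, predicted):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    A ratio with a zero denominator is returned as None.
    """
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    tn = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 0)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else None

    return ratio(tp + tn, len(gold)), ratio(tp, tp + fn), ratio(tn, tn + fp)


# Hypothetical toy notes, not real DIS records.
notes = [
    "patient reports substance use",
    "patient declined interview",
    "partner notified patient",
]
weights = tf_idf(notes)
metrics = binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

A word that appears in every note ("patient" here) receives a weight of zero under this variant, which is the sense in which TF-IDF downweights terms that are common across the whole dataset.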
A new algorithm based on Genetic Programming (GP) is developed for optimizing Multiple Constant Multiplication (MCM) by Common Subexpression Elimination (CSE). The method is used for hardware optimization of DSP systems. Its performance is demonstrated on one- and multi-dimensional digital filters with constant coefficients.
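To make the MCM/CSE objective concrete, the sketch below multiplies one input by three constants using only shifts and additions, sharing the subexpression 5x across all three products. It does not implement the GP search described in the abstract; the constants 5, 21, and 85 and the function names are hypothetical choices for illustration, and the unshared baseline assumes plain binary (not canonical signed digit) representations.

```python
def naive_adder_count(constants):
    """Adders needed when each positive constant is built independently
    from its plain binary representation (one adder per extra set bit)."""
    return sum(bin(c).count("1") - 1 for c in constants)


def shared_products(x):
    """Compute 5x, 21x and 85x with shifts and adds, reusing 5x.

    Returns the products and the number of adders used.
    """
    t5 = x + (x << 2)      # 5x  = x + 4x          (adder 1)
    t21 = t5 + (x << 4)    # 21x = 5x + 16x        (adder 2)
    t85 = t5 + (t5 << 4)   # 85x = 5x + 16 * (5x)  (adder 3)
    return {5: t5, 21: t21, 85: t85}, 3


products, shared_adders = shared_products(7)
unshared_adders = naive_adder_count([5, 21, 85])
```

In this toy case the shared form needs 3 adders where the independent binary form needs 6, which is the kind of saving a CSE search tries to find automatically.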
Abstract Background Accurate prevalence estimates of drug use and its harms are important to characterize burden and develop interventions to reduce negative health outcomes and disparities. Lack of a sampling frame for marginalized/stigmatized populations, including persons who use drugs (PWUD) in rural settings, makes this challenging. Respondent-driven sampling (RDS) is frequently used to recruit PWUD. However, the validity of RDS-generated population-level prevalence estimates relies on assumptions that should be evaluated. Methods RDS was used to recruit PWUD across seven Rural Opioid Initiative studies between 2018 and 2020. To evaluate RDS assumptions, we computed recruitment homophily and design effects, generated convergence and bottleneck plots, and tested for recruitment and degree differences. We compared sample proportions with three RDS-adjusted estimators (two variations of RDS-I and RDS-II) for five variables of interest (past 30-day use of heroin, fentanyl, and methamphetamine; past 6-month homelessness; and being positive for hepatitis C virus (HCV) antibody) using linear regression with robust confidence intervals. We compared regression estimates for the associations between HCV positive antibody status and (a) heroin use, (b) fentanyl use, and (c) age with RDS-I and RDS-II probability weights and with no weights, using logistic and modified Poisson regression and random-effects meta-analyses. Results Among 2,842 PWUD, median age was 34 years and 43% were female. Most participants (54%) reported opioids as their drug of choice; however, regional differences were present (e.g., methamphetamine range: 4-52%). Many recruitment chains were not long enough to achieve sample equilibrium. Recruitment homophily was present for some variables. Differences with respect to recruitment and degree varied across studies. Prevalence estimates varied only slightly with different RDS weighting approaches, and most confidence intervals overlapped.
Measures of association varied little by weighting approach. Conclusions RDS was a useful recruitment tool for PWUD in rural settings. However, several violations of key RDS assumptions were observed, which slightly affected estimation of proportions but not of associations.
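For readers unfamiliar with RDS weighting, the sketch below contrasts an unweighted sample proportion with the RDS-II (Volz-Heckathorn) estimate, which weights each participant by the inverse of their reported network degree. It is a minimal sketch, not the studies' analysis code: the function names and the four-person toy sample are hypothetical.

```python
def sample_proportion(has_trait):
    """Unweighted proportion of participants with the trait."""
    return sum(has_trait) / len(has_trait)


def rds_ii_proportion(degrees, has_trait):
    """RDS-II estimate: inverse-degree weighted proportion.

    degrees   -- self-reported network size for each participant (> 0)
    has_trait -- 1 if the participant has the trait, else 0
    """
    if any(d <= 0 for d in degrees):
        raise ValueError("all degrees must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_trait / sum(weights)


# Hypothetical toy sample: participants with the trait have smaller networks.
degrees = [2, 4, 4, 10]
trait = [1, 0, 1, 0]
unweighted = sample_proportion(trait)
weighted = rds_ii_proportion(degrees, trait)
```

Because low-degree participants are less likely to be recruited, they receive larger weights; in this toy sample the weighted estimate is therefore higher than the raw proportion of 0.5.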
A journal's quality is often assessed by its impact factor, a measure of the number of times a journal's published articles are cited in the scientific literature. However, the impact factor may not adequately measure a journal's influence on practice. As an alternative approach, we analyzed referenced articles of the 2015 and 2021 Centers for Disease Control and Prevention Sexually Transmitted Infections (STI) Treatment Guidelines, arguably the most influential document on STI prevention and care in the United States. Referenced articles in the 2015 and 2021 guidelines were abstracted and analyzed by source and year of publication, and sources were ranked by frequency of citation. Of 892 citations in 2015 and 1454 citations in 2021, the most frequently cited reference sources included the journals Sexually Transmitted Diseases (14.0% and 12.8% in 2015 and 2021, respectively), Clinical Infectious Diseases (7.5% and 8.2%), and Sexually Transmitted Infections (5.6% and 6.4%). Sexually transmitted infection specialty journals influence STI prevention and practice beyond what would be expected from the journals' impact factor alone.
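The ranking step described above amounts to counting citations per source and expressing each count as a share of the total. A minimal sketch follows; the function name and the placeholder source labels are hypothetical and do not reproduce the guidelines' reference lists.

```python
from collections import Counter


def rank_sources(cited_sources):
    """Return (source, count, percent of all citations), most cited first."""
    counts = Counter(cited_sources)
    total = len(cited_sources)
    return [(src, n, 100.0 * n / total) for src, n in counts.most_common()]


# Placeholder labels standing in for abstracted reference sources.
citations = ["Journal A", "Journal B", "Journal A", "Journal C", "Journal A", "Journal B"]
ranking = rank_sources(citations)
```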
Several reports have suggested relations of alcohol abuse to level of control experienced over various life pressures or forces. This study assessed test-retest reliability of the Experienced Control Scale (EC) (Tiffany, 1967) within a male alcoholic sample. The EC was completed on two occasions 1 week apart by 48 inpatients on an alcoholism treatment unit. Resulting test-retest reliability coefficients were .57 for the Internal ratio score, .79 for the External ratio score, .72 for the sum of the two ratio scores, and from .56 to .69 for the four basic scores used in computing ratio scores. Intellectual ability as assessed by the Shipley Institute of Living Scale was unrelated to EC scores and occasionally but conflictingly related to temporal stability of EC scores. Neither age nor education showed a significant relationship to temporal stability of the EC or to ratio scores. Implications of findings for clinical and research applications of the EC are discussed, particularly support for combining the ratio scores rather than treating them separately. Possible determinants of the obtained stability of the EC also are explored.
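A test-retest reliability coefficient of this kind is commonly computed as the correlation between scores from the two administrations. The sketch below assumes a Pearson correlation, which the abstract does not specify; the function name and the toy scores are hypothetical and are not EC data.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_a = sum(first) / len(first)
    mean_b = sum(second) / len(second)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical scores for five respondents on two occasions.
time_1 = [1, 2, 3, 4, 5]
time_2 = [2, 1, 4, 3, 5]
reliability = pearson_r(time_1, time_2)
```

Values near 1 indicate that respondents keep roughly the same rank and spacing across occasions; this function raises a ZeroDivisionError if either set of scores has no variance.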
In this paper, methods for generating 3-variable Very Strictly Hurwitz Polynomials (VSHP's) are presented. It is shown that the derived 3-variable VSHP's could be used to design stable three-dimensional digital filters satisfying prescribed magnitude specifications.