Notes
Identifying patient characteristics that influence the rate of colorectal polyp recurrence can provide important insights into which patients are at higher risk for recurrence. We used natural language processing to extract polyp morphological characteristics from the electronic medical records of 953 polyp-presenting patients. We used subsequent colonoscopy reports to examine how the time to polyp recurrence (731 patients experienced recurrence) is influenced by these characteristics as well as anthropometric features, using Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. We found that the rate of recurrence differed significantly by polyp size, number, and location and by patient smoking status. Additionally, right-sided colon polyps increased recurrence risk by 30% compared to left-sided polyps. History of tobacco use increased polyp recurrence risk by 20% compared to never-users. A random survival forest model showed an AUC of 0.65 and identified several other predictive variables, which can inform development of personalized polyp surveillance plans.

Individuals increasingly rely on social media to discuss health-related issues. One way to provide easier access to relevant information is through sentiment analysis: classifying text into polarity classes such as positive and negative. In this paper, we generated freely available datasets of WebMD.com drug reviews and star ratings for Common, Cancer, Depression, Diabetes, and Hypertension drugs. We explored four supervised learning models (Naive Bayes, Random Forests, Support Vector Machines, and Convolutional Neural Networks) for the purpose of determining the polarity of drug reviews. We conducted inter-domain and cross-domain evaluations. 
We found that SVM obtained the highest F-measure on average and that cross-domain training produced results similar to or higher than those of models trained directly on their respective datasets.

Modern electronic health records (EHRs) provide data to answer clinically meaningful questions. The growing volume of data in EHRs makes healthcare ripe for the use of machine learning. However, learning in a clinical setting presents unique challenges that complicate the use of common machine learning methodologies. For example, diseases in EHRs are poorly labeled, conditions can encompass multiple underlying endotypes, and healthy individuals are underrepresented. This article serves as a primer to illuminate these challenges and highlights opportunities for members of the machine learning community to contribute to healthcare.

Hypotension in critical care settings is a life-threatening emergency that must be recognized and treated early. While fluid bolus therapy and vasopressors are common treatments, it is often unclear which interventions to give, in what amounts, and for how long. Observational data in the form of electronic health records can help inform these choices from past events, but often it is not possible to identify a single best strategy from observational data alone. In such situations, we argue it is important to expose the collection of plausible options to a provider. To this end, we develop SODA-RL (Safely Optimized, Diverse, and Accurate Reinforcement Learning) to identify distinct treatment options that are supported in the data. We demonstrate SODA-RL on a cohort of 10,142 ICU stays where hypotension presented. Our learned policies perform comparably to the observed physician behaviors, while providing different, plausible alternatives for treatment decisions.

The effective use of EHR data for clinical research is challenged by the lack of methodologic standards, transparency, and reproducibility. 
For example, our empirical analysis of clinical research ontologies and reporting standards found little to no informatics-related standards. To address these issues, our study aims to leverage natural language processing techniques to discover the reporting patterns and data abstraction methodologies for EHR-based clinical research. We conducted a case study using a collection of full articles of EHR-based population studies published using the Rochester Epidemiology Project infrastructure. Our investigation discovered an upward trend in the reporting of EHR-related research methodologies, good practice, and the use of informatics-related methods. For example, among 1279 articles, 24.0% reported training for data abstraction, 6% reported that the abstractors were blinded, 4.5% tested the inter-observer agreement, 5% reported the use of a screening/data collection protocol, 1.5% reported that team meetings were organized for consensus building, and 0.8% mentioned supervision activities by senior researchers. Despite that, the overall rate of reporting and adoption of methodologic standards was still low. Clinical research reporting also varied widely. Thus, continued development of process frameworks, ontologies, and reporting guidelines to promote good data practice in EHR-based clinical research is recommended.

Reliable cohort discovery is an essential early part of clinical study design. Indeed, it is the defining feature of many clinical research networks, including the recently launched Accrual to Clinical Trials (ACT) network. As currently deployed, however, the ACT network only allows cohort queries in isolated silos, rendering cohort discovery across sites unreliable. Here we demonstrate a novel protocol to provide network participants access to more accurate combined cohort estimates (union cardinality) with other sites. 
A two-party ElGamal protocol is implemented to meet privacy and security imperatives, and a special attribute of Bloom filters is exploited for accurate and fast cardinality estimates. To emulate mandatory privacy-protecting obfuscation factors (like those applied to the counts reported for individual sites by ACT), we configure the Bloom filter based on the individual site cohort sizes, striking an appropriate balance between accuracy and privacy. Finally, we discuss additional approval and data governance steps required to incorporate our protocol in the current ACT infrastructure.

Healthcare analytics is impeded by a lack of machine learning (ML) model generalizability, the ability of a model to predict accurately on varied data sources not included in the model's training dataset. We leveraged free-text laboratory data from a Health Information Exchange network to evaluate ML generalization using Notifiable Condition Detection (NCD) for public health surveillance as a use case. We 1) built ML models for detecting syphilis, salmonella, and histoplasmosis; 2) evaluated generalizability of these models across data from holdout lab systems; and 3) explored factors that influence weak model generalizability. Models for predicting each disease achieved considerable accuracy. However, they demonstrated poor generalizability across data from the holdout lab systems being tested. Our evaluation determined that weak generalization was influenced by the varying syntactic nature of free-text datasets across lab systems. Results highlight the need for actionable methodology to generalize ML solutions for healthcare analytics.

Drug-drug interactions (DDI) can cause severe adverse drug reactions and pose a major challenge to medication therapy. Recently, informatics-based approaches have been emerging for DDI studies. In this paper, we aim to identify key pharmacological components in DDI based on large-scale data from DrugBank, a comprehensive DDI database. 
With pharmacological components as features, logistic regression is used to perform DDI classification with a focus on searching for the most predictive features, a process of identifying key pharmacological components. Using univariate feature selection with the chi-squared statistic as the ranking criterion, our study reveals that the top 10% of features can achieve classification performance comparable to that obtained using all features. The top 10% of features are identified as key pharmacological components. Furthermore, their importance is quantified by feature coefficients in the classifier, which measure the DDI potential and provide a novel perspective for evaluating pharmacological components.

With the increasing use of social media data for health-related research, the credibility of the information from this source has been questioned, as the posts may not originate from personal accounts. While automatic bot detection approaches have been proposed, none have been evaluated on users posting health-related information. In this paper, we extend an existing bot detection system and customize it for health-related research. Using a dataset of Twitter users, we first show that the system, which was designed for political bot detection, underperforms when applied to health-related Twitter users. We then incorporate additional features and a statistical machine learning classifier to improve bot detection performance significantly. Our approach obtains an F1-score of 0.7 for the "bot" class, an improvement of 0.339. Our approach is customizable and generalizable for bot detection in other health-related social media cohorts.

Mapping local terminologies to standardized terminologies facilitates secondary use of electronic health records (EHR). Penn Medicine comprises multiple hospitals and facilities within the Philadelphia metropolitan area, providing services from primary to quaternary care. 
Our Penn Medicine (PennMed) data include medications collected during both inpatient and outpatient encounters at multiple facilities. Our goal was to map 941,198 unique medication terms to RxNorm, a standardized drug nomenclature from the National Library of Medicine (NLM). We chose three popular tools for mapping: NLM's RxMix and RxNav-in-a-Box, OHDSI's Usagi, and Mayo Clinic's MedXN. We manually reviewed 400 mappings obtained from each tool and evaluated their performance for drug name, strength, form, and route. RxMix performed the best with an F1 score of 90% for drug name versus Usagi's 82% and MedXN's 74%. We discuss the strengths and limitations of each method and offer tips for other institutions seeking to map a local terminology to RxNorm.

In this paper, we investigate the task of spatial role labeling for extracting spatial relations from chest X-ray reports. Previous works have shown the usefulness of incorporating syntactic information in extracting spatial relations. We propose syntax-enhanced word representations in addition to word and character embeddings for extracting radiology-specific spatial roles. We utilize a bidirectional long short-term memory (Bi-LSTM) conditional random field (CRF) as the baseline model to capture the word sequence and employ additional Bi-LSTMs to encode syntax based on dependency tree substructures. Our focus is on empirically evaluating the contribution of each syntax integration method in extracting the spatial roles with respect to a SPATIAL INDICATOR in a sentence. The incorporation of syntax embeddings into the baseline method achieves promising results, with improvements of 1.3, 0.8, 4.6, and 4.6 points in the average F1 measures for TRAJECTOR, LANDMARK, DIAGNOSIS, and HEDGE roles, respectively.

Up to 50% of antibiotic use in hospital settings is suboptimal. We build machine learning models trained on electronic health record data to minimize wasteful use of antibiotics. 
Our classifiers flag no-growth blood and urine microbial cultures with high precision. Further, we build models that predict the likelihood of bacterial susceptibility to sets of antibiotics. These models contain decision thresholds that separate subgroups of patients whose susceptibility rates to narrow-spectrum antibiotics equal overall susceptibility rates to broader-spectrum drugs. Retrospectively analyzing these thresholds on our one-year test set, we find that 14% of patients infected with Escherichia coli and empirically treated with piperacillin/tazobactam could have been treated with ceftriaxone with coverage equal to the overall susceptibility rate of piperacillin/tazobactam. Similarly, 13% of the same cohort could have been treated with cefazolin, a first-generation cephalosporin.
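
The decision-threshold idea in the antibiotic abstract above can be illustrated with a small sketch. This is not the authors' method: the function name `pick_threshold`, the toy scores and labels, the target rate of 0.75, and the rule of taking the lowest cutoff whose flagged subgroup reaches the target are all assumptions made for illustration.

```python
def pick_threshold(scores, susceptible, target_rate):
    """Find the lowest score cutoff whose flagged subgroup (score >= cutoff)
    has an observed narrow-spectrum susceptibility rate >= target_rate.

    scores      -- hypothetical model-predicted susceptibility probabilities
    susceptible -- 1 if the isolate was actually susceptible, else 0
    target_rate -- assumed overall susceptibility rate to the broader drug
    Returns (cutoff, fraction_of_patients_flagged), or None if no cutoff works.
    """
    pairs = sorted(zip(scores, susceptible), reverse=True)
    best = None
    hits = 0
    for i, (score, is_susceptible) in enumerate(pairs, start=1):
        hits += is_susceptible
        # Evaluate a cutoff only once a run of tied scores is complete.
        if i < len(pairs) and pairs[i][0] == score:
            continue
        if hits / i >= target_rate:
            best = (score, i / len(pairs))
    return best


# Toy data, invented for illustration only.
scores = [0.95, 0.90, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20]
susceptible = [1, 1, 1, 1, 0, 1, 0, 0]
result = pick_threshold(scores, susceptible, target_rate=0.75)
print(result)
```

On this toy data the cutoff is 0.40, flagging 75% of the patients (5 of the 6 flagged isolates are susceptible). In practice a cutoff like this would be chosen on held-out data and then checked on a separate test set, as the abstract describes for its one-year test set.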