Personal Site

This is a list of my publications. A brief informal summary of each work is available by clicking on the title of the paper.



  • Identifying Medications that Patients Stopped Taking in Online Health Forums (Honorable Mention) (PDF Slides)
    Jason H.D. Cho, Tony Gao, Roxana Girju
    IEEE International Conference on Semantic Computing 2017

  • 2015

  • Recommending forum posts to designated experts (PDF Slides)
    Jason H.D. Cho, Yanen Li, Roxana Girju, Chengxiang Zhai
    IEEE Big Data 2015

  • 2014

  • Understanding User Intents in Online Health Forums (PDF Slides)
    Thomas Zhang, Jason H.D. Cho, Chengxiang Zhai
    ACM conference on Bioinformatics, Computational Biology and Biomedical Informatics. Newport Beach, CA, 2014

    Summary: When patients post on online health forums, they have different types of intent. Some post to ask what disease they may have, while others ask whether a particular medication has side effects. In this paper, we proposed an intent-identification algorithm, with the intents derived from the existing medical literature. We further showed that intent distributions differ across forums. On depression forums, many people were curious about how they could manage their problems. In heart disease and breast cancer forums, on the other hand, people were mostly concerned with the cause of particular symptoms. In the future, we wish to use this technique to further help mine health forums.

  • Resolving Healthcare Forum Posts via Similar Thread Retrieval (PDF Slides)
    Jason H.D. Cho, Parakshit Sondhi, Chengxiang Zhai, Bruce R. Schatz
    ACM conference on Bioinformatics, Computational Biology and Biomedical Informatics. Newport Beach, CA, 2014

    Summary: It would be useful if patients could write long queries to retrieve the most relevant forum posts. With this in mind, we proposed several techniques to help users retrieve health-related information from the web, in particular from web forums. The first was labeling each sentence with an intent (describing symptoms, describing medications, other). Labeling sentences actually worked better than identifying medical terms. We also assigned different weights to different posts in a thread. Category information was very useful as well. We believe this work will help patients retrieve relevant information.

  • Local Learning for Mining Outlier Subgraphs from Network Datasets (PDF)
    Manish Gupta, Arun Mallya, Subhro Roy, Jason H.D. Cho, Jiawei Han
    Proceedings of the 14th SIAM International Conference on Data Mining, Philadelphia, PA, 2014

    Summary: Graphs encode many of the entity relationships around us. We aimed to find outlier subgraphs in a network. Outliers are interesting because we can use them to look for suspicious activity in a given network. We used an optimization technique to encode subgraphs. In particular, a given subgraph is an outlier if 1) the subgraph itself ranks high on the outlier score or 2) its neighborhood is suspicious.


  • Aggregating Personal Health Messages for Scalable Comparative Effectiveness Research (PDF, Slides)
    Jason H.D. Cho, Vera Q.Z. Liao, Yunliang Jiang, Bruce R. Schatz
    ACM conference on Bioinformatics, Computational Biology and Biomedical Informatics. Washington DC, USA, 2013
    (Treatment lists: Heart Disease, Breast Cancer)

    Summary: We used online health forums to generate hypotheses for Comparative Effectiveness Research (CER). This is an important research area because CER can 1) help cut treatment administration costs and 2) give patients options on what is best for them.
    We used the 'preference' (sentiment) expressed toward medications/treatments to conduct CER. Despite the limitations of using preference to generate CER hypotheses, our results appeared consistent with the medical literature.
    For future directions, we are planning on 1) combining multiple forum sources (our current work used only Medhelp), 2) examining demographics based on different symptoms, and 3) examining actual effectiveness. Contributions: We showed that it is possible to generate hypotheses for CER from web forums (health messages), and we introduced a high-precision demographic extractor.

  • 2011

  • Robust Classification of Curvilinear and Surface-like Structures in 3d Point Cloud Data
    Mahsa Kamali, Matei Stroila, Jason Cho, Eric Shaffer, and John C. Hart
    In Proceedings of ISVC 2011, Lecture Notes in Computer Science, Springer-Verlag

    Summary: We classified each point in 3D point cloud data as either 1) ground or 2) a complicated structure such as a tree branch. For the classification we used an Associative Markov Network framework, which uses both local features and neighborhood consistency. We then compared this method against the Directional Associative Markov Network, and our method performed better.

  • Journal Publications


  • Understanding User Intents in Online Health Forums (PDF, Features used for Training, Readme for Patterns)
    Thomas Zhang, Jason H.D. Cho, Chengxiang Zhai
    IEEE Journal of Biomedical and Health Informatics

    Summary: This is a journal version of the paper with the same title.

  • Guest presentations/Symposiums/Workshop papers

  • Why patients in online communities report not taking SSRIs (Poster paper), American Medical Informatics Association (AMIA), Chicago (November, 2016)

  • Classifying user reported drug experiences to understand noncompliance, Midwest Speech and Language Days (MSLD), Bloomington (April, 2016)

  • More than a Feeling: Investigating Multi-word expressions about Body Parts and Emotions, Midwest Speech and Language Days (MSLD), Bloomington (April, 2016)

  • Utilizing Smartphones to Enhance Urine Strip Accessibility (Poster paper), American Medical Informatics Association (AMIA), Washington DC (November, 2014)

  • Gamifying Health Data Collection, Engineering in Medicine and Biology Society (EMBC), Chicago (August, 2014)

  • Addressing Users’ Healthcare Needs through Personal Health Messages, DAIS Seminar, UIUC (October, 2013)