Medical Record Synthesis Using an LLM
Patient data are well protected under HIPAA, GDPR, and similar laws in many countries. Researchers, whether academic or independent, need access to openly available medical records for machine learning, AI, and similar projects. Synthetic medical data may be one way to meet that need while staying compliant with privacy policies and laws.
This project explores medical record synthesis with the help of the GPT-3.5 language model.
In this project, prompt engineering was used to instruct the LLM to generate synthetic patients and their medical records.
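To make the idea of a generation prompt concrete, here is a minimal Python sketch of how such a prompt could be assembled. It is an illustration only, not the prompt used in this project: the function name `build_prompt`, the section names in `NOTE_SECTIONS`, and the wording of the instructions are all assumptions. It only builds the prompt text and does not call any model.

```python
# Hypothetical section names for a semi-structured note (an assumption,
# not taken from the project).
NOTE_SECTIONS = (
    "Chief Complaint",
    "History of Present Illness",
    "Medications",
    "Assessment and Plan",
)


def build_prompt(disease, sections=NOTE_SECTIONS):
    """Return an instruction prompt asking for one fictional patient note."""
    # One bullet per requested section, so the model sees the expected layout.
    section_lines = "\n".join(f"- {name}" for name in sections)
    return (
        "You are generating synthetic data for research. "
        "Do not use any real person's information.\n"
        f"Invent a fictional patient diagnosed with {disease} "
        "and write a medical note with these sections:\n"
        f"{section_lines}\n"
    )


if __name__ == "__main__":
    # Print the prompt so it can be pasted into ChatGPT for a quick test.
    print(build_prompt("type 2 diabetes"))
```

Keeping the section list as a parameter makes it easy to try different note layouts while leaving the rest of the instruction unchanged.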
Objectives
- To study the feasibility of synthetic medical record generation
- To study the correctness and relevance of facts in the notes
- To study the whole process, from prompt writing to generation of medical notes in a popular database format
- To apply synthetic medical notes in prototyping and future language model research
Summary
The first step in generating medical data is to test basic prompts in ChatGPT. Our goal is to obtain realistic medical notes in a semi-structured format. The following steps are used to generate medical records from diseases (image 2.0).