Discussion
Issues related to the use of primary care data for research are complex. Government reimbursement system administrative data have limitations as they lack clinical detail. General practice electronic medical record data are more suitable; however, challenges include variable data quality and interoperability. There are concerns from general practices and the public about data access and use. Strategies to address these issues include incorporating best-practice principles, implementing standards and data quality frameworks, creating partnerships between data custodians and ensuring robust governance systems exist. Leadership and the will of key stakeholders to reform, with governmental support in implementing required actions, must be prioritised.
This article is part of a longitudinal series on research.
Interest in secondary use of primary care data for research is growing, as evidenced by the formal establishment of government-led data collection initiatives for research (NPS MedicineWise, The Health Improvement Network, Clinical Practice Research Datalink) and the increasing number of research publications utilising these data over time. Primary care data used for non-clinical purposes (secondary use) are typically gathered from administrative and clinical sources through data-sharing agreements with the original data holder. Administrative data include ambulatory care service and medication dispensing reimbursement data captured from the Australian Medicare Benefits Schedule (MBS) and the Pharmaceutical Benefits Scheme (PBS). Clinical data can be obtained by government agencies, universities or other organisations directly from electronic medical record (EMR) systems embedded into clinical information systems in general practices.1 EMRs have been part of general practice in Australia for decades, with large volumes of data continuously generated and stored.2,3 In addition to clinical care, primary care data can be used for a multitude of research activities, such as longitudinal cohort, interventional and comparative studies, big data analytics for randomised controlled trials and predictive modelling.4 However, these data are currently underutilised for research in Australia compared with similar countries, despite the known positive effects,4,5 as acknowledged by Australia’s Productivity Commission.6,7
Aim
This article aims to explore issues associated with the use of primary care data for research in Australia – in particular data quality, interoperability, linkage and access – and propose solutions to address them.
Issues with primary care data
Administrative data from both the MBS and the PBS have very good coverage and quality; however, they do not contain the necessary detail required for a broad range of primary care research.1 Details such as patient diagnoses, test results, observation measurements and prescribing instructions, which provide the clinical context needed to answer research questions, are absent from administrative data.1,8 This was evident in early studies that linked MBS and PBS data to state healthcare datasets to examine the effects of primary care on hospitalisations and mortality.4,9,10 Due to the limited clinical information available, many assumptions needed to be applied to derive meaning from the findings.4 Clinical data from general practice EMR systems are more suitable for this purpose; however, these data also carry inherent issues.
EMRs used in Australian general practice were primarily designed to improve administrative and clinical workflows, including Medicare claims management. The use of captured data for research was not a design consideration.4 These EMR systems were developed independently, each with a unique schema for medical terminology and clinical coding, which prevents direct interoperability,4,11 and they rely on free-text rather than coded data entry.1 The absence of standardised data practices has resulted in inconsistent approaches to storing and reporting information for secondary use, and in suboptimal data quality.1,4,11
Lack of interoperability between EMR systems and data extraction tools and the absence of accreditation to ensure data are standardised to a common data model contribute to varying formats of, and repositories for, data storage.1,4,12 As a result, research that requires aggregating data across practices using different EMR systems and data extraction tools is challenging.1,4,13 Furthermore, a widely used commercially developed data extraction tool has been described as ‘a barrier to better use of primary care data’ due to its associated inflexible legal and data governance arrangements.13
Access to primary healthcare data and linked datasets is a major issue for research. General practice EMR data are regularly collected by Primary Health Networks (PHNs), Australian government-funded independent organisations whose role is to assess primary and community healthcare, report to government and commission services for quality improvement purposes.14 Data gathered are used for quality improvement activities (ie performance feedback) and to inform health service planning and policy development.
Despite established pathways for PHN use of EMR data, access to these data for research purposes within the university sector is not as streamlined.1,13 Research involving primary care data is often carried out in ‘research silos’, thereby limiting opportunities for ‘big picture’ research collaborations. Data access barriers also limit the use of EMR data for research. These barriers include the protracted time needed to gain approval from data custodians and then to obtain the data once approval is granted.13 The reticence of general practitioners and other holders of primary care data to share it can also be attributed to a general lack of trust linked to fears around potentially poor data security and privacy, questions regarding ownership of data once shared and reputational and financial damage should there be any data breach.5 Financial constraints might also prevent secondary use: access fees imposed by custodians might range from a modest flat fee to many tens of thousands of dollars.5
Potential solutions
Addressing the issues associated with secondary use of primary care data requires a comprehensive approach, as these issues are multilayered at the data, technology and system levels. The development and application of clinician-, researcher- and consumer-agreed best practice principles for appropriate use of health data are needed to ensure healthcare provider and public trust. Best practices include de-identification of data before they are extracted into a repository, governance committees independent of the data custodian/managers to make decisions about use of the data on behalf of the public, transparent governance processes, robust security systems and provision of the minimum required information to answer research questions.4,15–19
Issues with data quality might be addressed using a workforce approach. Primary care workforce training for best practice data collection has been in place since the rollout of the Australian Government-funded Practice Incentives Program (PIP) Quality Improvement (QI) Incentive;20 notwithstanding its successes, PIP QI is limited by EMR design. Increased provision of clinician health informatician support to ensure appropriate data capture and interpretation5 and improved data collection tools that focus on data quality and continuity21,22 might also be helpful.
To improve data quality output from EMR systems, a suite of standards must be adopted, owned and implemented at scale. These include defined data models that establish linkages between related data elements, consistency between data element labels and definitions, use of standardised clinical terminologies and classifications, and the introduction of an accreditation process for quality assurance.4,11 Data models that support high-quality care already exist; these include HL7 FHIR and openEHR. Additionally, widely used standardised clinical terminologies (SNOMED CT and Australian Medicines Terminology) have been mapped to the International classification of diseases, 10th revision.23 The incorporation of data quality frameworks, such as Kahn’s harmonised data quality framework,24 also enables rigorous assessments of data quality, including fitness for purpose assessments, to be performed.13 For research use, interoperability challenges between EMR systems and extraction tools can be addressed by mapping data to common data models, such as the Observational Medical Outcomes Partnership Common Data Model,25 and ensuring data extraction packages are capable of working across multiple software packages.5
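To make the idea of mapping to a common data model more concrete, the following is a minimal sketch only, not an implementation of the Observational Medical Outcomes Partnership Common Data Model or of any real EMR export. The two vendor formats, their field names, the lookup tables and the `CONCEPT_...` labels are all hypothetical placeholders (they are not real SNOMED CT identifiers). The sketch shows diagnosis rows from two differently structured extracts being converted to one shared record shape, followed by a simple completeness measure of the kind a data quality framework might report.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookup tables from each vendor's local codes to a shared concept
# label. The labels are placeholders, not real SNOMED CT identifiers.
VENDOR_A_MAP = {"DM2": "CONCEPT_T2DM", "HTN": "CONCEPT_HYPERTENSION"}
VENDOR_B_MAP = {"diab_t2": "CONCEPT_T2DM", "hypert": "CONCEPT_HYPERTENSION"}


@dataclass
class CommonDiagnosis:
    """One diagnosis row in the shared (common) format."""
    patient_key: str
    concept: Optional[str]  # None when the source entry could not be mapped
    source_system: str


def from_vendor_a(row: dict) -> CommonDiagnosis:
    # Assumed vendor A layout: flat rows with 'pid' and 'dx_code'.
    return CommonDiagnosis(row["pid"], VENDOR_A_MAP.get(row["dx_code"]), "vendor_a")


def from_vendor_b(row: dict) -> CommonDiagnosis:
    # Assumed vendor B layout: nested 'diagnosis' that may hold free text only.
    code = row.get("diagnosis", {}).get("code")
    return CommonDiagnosis(row["patient"], VENDOR_B_MAP.get(code), "vendor_b")


def completeness(records: List[CommonDiagnosis]) -> float:
    """Proportion of rows that carry a mapped (coded) concept."""
    if not records:
        return 0.0
    return sum(1 for r in records if r.concept is not None) / len(records)


vendor_a_rows = [{"pid": "a1", "dx_code": "DM2"}, {"pid": "a2", "dx_code": "HTN"}]
vendor_b_rows = [
    {"patient": "b1", "diagnosis": {"code": "diab_t2"}},
    {"patient": "b2", "diagnosis": {"text": "sugar problems"}},  # free text only
]

common = [from_vendor_a(r) for r in vendor_a_rows] + [
    from_vendor_b(r) for r in vendor_b_rows
]
print(completeness(common))
```

In this toy data, the free-text-only entry cannot be mapped and lowers the completeness score, which illustrates why reliance on free-text entry limits the value of pooled data.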
Improved data linkage for research can be attained by establishing accountable partnerships between the various stakeholders, such as universities, PHNs and government.13 Provision of incentives and additional funding should also be considered to encourage the sharing of data between these entities. Such partnerships can enable the possibility of a centralised coordination model for primary care data; this will improve research capacity through improved data quality, timely access, reduced duplication of effort and the ability to link to gold standard datasets. Concerns about privacy loss associated with linkage can be mitigated by ‘privacy-preserving record linkage’,26 which involves irreversibly coding patient identifiers prior to extraction and linkage.1,4 EMR de-identification, where all patient and provider identifiers in the data are removed, enables data within the EMR to be used or shared in ways that might not otherwise be permitted under the Privacy Act 1988.27 Privacy concerns pertaining to public and healthcare provider trust in the secondary use of data in research need further consideration. Consistency and transparency in governance systems in research, including the provision of secure research environments, researcher contractual obligations, sharing of data breach risk mitigation and management strategies with consumers, mandatory research training and proactive standard operating procedures, are necessary to gain this trust.13 Effective communication of this information is equally important to allay fears, especially around data security and sharing availability and preferences.13 The Royal Australian College of General Practitioners’ checklist for the secondary use of de-identified data28 and guiding principles for managing requests for the secondary use of de-identified data29 are valuable resources that will help general practices manage requests for access to their data. These documents can empower healthcare providers to make informed decisions regarding their EMRs and to overcome any initial doubts or concerns they might have with research-related data requests.
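The ‘privacy-preserving record linkage’ approach mentioned above, in which patient identifiers are irreversibly coded before extraction and linkage, can be illustrated with a simplified sketch. It assumes that both data holders share a secret key through a trusted party and that records match exactly after basic normalisation. The function names, field choices and the demonstration key are hypothetical. Linkage methods used in practice are often more sophisticated (for example, encodings that tolerate spelling variation), so this is an illustration of the principle rather than a recommended implementation.

```python
import hashlib
import hmac
import unicodedata


def normalise(value: str) -> str:
    """Lower-case the value and keep only letters and digits."""
    text = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in text if ch.isalnum()).lower()


def linkage_key(secret: bytes, given_name: str, family_name: str,
                date_of_birth: str) -> str:
    """Derive a keyed, one-way code from identifiers (HMAC-SHA256)."""
    message = "|".join(
        normalise(part) for part in (given_name, family_name, date_of_birth)
    )
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link(left: dict, right: dict) -> dict:
    """Join two datasets on their coded keys; no identifiers are exchanged."""
    return {key: (left[key], right[key]) for key in left.keys() & right.keys()}


# Hypothetical demonstration key; a real key would be generated and held securely.
SECRET = b"demo-only-secret"

# Each data holder codes identifiers locally before anything leaves the site.
practice = {
    linkage_key(SECRET, "Mary-Anne", "O'Neil", "1980-01-02"): {"hba1c": 7.1},
}
hospital = {
    linkage_key(SECRET, "mary anne", "ONeil", "1980-01-02"): {"admissions": 2},
    linkage_key(SECRET, "John", "Smith", "1975-05-05"): {"admissions": 1},
}

linked = link(practice, hospital)
print(len(linked))
```

A keyed hash is used here rather than a plain hash so that someone without the secret cannot rebuild the codes by hashing a list of likely names and birth dates.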
Steps have been taken by the Australian Health Research Alliance’s Transformational Data Collaboration30 to address some of these issues: improving health data useability through the development of tools and methods to improve data integration and harmonisation; and increasing user capacity by providing cost-free common data model training for researchers.5,30 The success of this initiative relies on widespread professional, consumer and vendor support, along with the establishment of clear and enforced timescales and, potentially, the provision of regulatory incentives to break the status quo. Leadership from key stakeholders (professional bodies, universities, primary health networks and data custodians) with governmental support and funding is required to enable a national, cohesive approach to the development and implementation of standards for general practice EMRs and improve data quality.11 Policy and governance reforms to improve access and linkage between practice and research will enable the aforementioned ‘big picture’ collaborations through integration of data currently housed in different repositories and more fluid data sharing.5 This will improve the current poor data utilisation and reduce the inefficiencies and unnecessarily high economic burden of duplication of effort.13
Conclusion
Primary care data are a rich source of information that can contribute to healthcare improvement through research. Unfortunately, many challenges hinder the optimal use of these data. Issues include challenges with data quality and access and data custodian fears of compromised privacy. Strategies to address these matters include incorporating evidence-based principles of best practice, implementing EMR system standards and data quality frameworks, creating accountable partnerships between data custodians, ensuring the transparency of professional and consumer input and having robust governance systems in place. Leadership from key stakeholders with governmental support in implementing standards across EMR systems and national legislation to ensure harmonisation of health data use must be prioritised.
Key points
- The use of general practice EMR data provides opportunities to undertake large-scale observational research. Poor data quality, limitations in the necessary structures to facilitate interoperability, lack of implementation of best practice for the capture of an ‘optimal’ dataset, linkage barriers, privacy concerns and limits to access all need to be overcome to facilitate appropriate use.
- Administrative data derived from national healthcare reimbursement schemes are robust; however, they lack the clinical detail required for clinically focused primary care research.
- The application of best practice principles for the appropriate use of health data for research is crucial to establish and maintain trust among data custodians and the public to ensure continued access to data for research.
- Effective leadership from professional bodies, universities, primary health networks and data custodians, along with governmental support, is required to drive the necessary changes to address primary care data issues.
- Australia is said to ‘lag behind’ comparable countries in the secondary use of health data.