As the leading cause of death in the country,1 cardiovascular disease (CVD) is a substantial economic burden on the US healthcare system and has significant financial implications for payers, providers, and patients.2 Despite ever-increasing spending on CVD management, contemporary practices for lowering downstream risk and cost of care in patients with CVD remain suboptimal and variable at the national, organizational, practice, and provider levels. The current fragmentation and disarray of the healthcare system greatly affect quality of care, cost, and outcomes for both patients with existing disease and those at risk for future CVD.3-5

Improving healthcare value, defined as the health quality achieved per dollar spent, remains a fundamental challenge for the US healthcare system.3,4 Unsustainable growth of healthcare expenditures and inconsistent clinical outcomes have forced stakeholders to critically re-evaluate how health care is delivered. Overall, these massive gaps and variations in evidence-based care can be attributed to the intricate interplay between widespread differences of opinions among unaided frontline providers,6,7 lack of system incentives for clinical practice standardization in the face of growing complexity of evidence,8,9 and documented clinical uncertainty,10 all of which are occurring in a fragmented healthcare delivery environment. In the context of optimizing health care and maximizing value of delivered services, an operational framework of the learning health system model can be adopted to inform clinical integration and strengthen population health system performance for CVD outcomes using big data analytics and digital application tools (Figure 1).11,12

Figure 1. Learning health care system model for big data analytics and digital application tools. CVD: cardiovascular disease

In this review, we discuss the steps needed to construct the big data architecture for value-based healthcare delivery under the framework of population health management. These solutions are primed to enhance CVD care by (1) addressing widely prevalent conditions, the high cost to patients and health systems, and the relevance for multiple clinical domains (prevention, diagnostic, intervention) and stakeholders involved in healthcare delivery, and (2) using a wealth of evidence to inform implementation of best practices to yield the highest return on investment with big-data-guided tools.


With exponential growth in the four Vs of data (volume, velocity, variety, and veracity),13 deriving the fifth V (value) for population health services will depend substantially on creating a robust data-science platform to capture, store, and organize data from multiple sources.14 However, to date there are few published reports describing big data platforms in the healthcare sector and their accompanying implementation challenges. One example of such a platform was recently developed at Yale University using Hadoop, an open-source distributed data storage and processing framework, to create integrated data lakes (ie, central data repositories) for storage of large real-time datasets. In addition, the platform supports robust real-world data analytics.15,16

The implementation and use of these agile integrated data-science platforms offer healthcare organizations the real-time capacity to capture diverse information from the most readily available sources, such as electronic medical records (EMR).15,16 Concurrently, such platforms ensure capacity growth to harmonize additional data sources, such as those derived from imaging, and environmental and social determinants of health, afforded by geocoding, patient surveys, and links to growing institutional biobanking assets that are critical to the success of population health initiatives (Figure 2).15

Figure 2. Cardiovascular disease (CVD) population health management big data platform. CV: cardiovascular; HM: Houston Methodist; CDSS: clinical decision support system

Big data provides unique insights into population health management, with implications for lower disease burden, reduced healthcare expenditure, and evidence-based prevention and management approaches.17,18 However, the term "big" must not take the focus away from the promise and value of small-data-driven personalized care.19 In fact, robust big data platforms and novel data management, processing, and interpretation tools offer exciting opportunities for synergizing the power of big and small data to inform critical advances in precision medicine in the coming years.17,20,21

While data curation and harmonization remain the first steps, perhaps more importantly, these robust infrastructures provide unique opportunities for parallel cloud computing and adoption of multiple open-source analytic packages specifically facilitating the transition from data to knowledge.22,23 However, development, maintenance, and security of these platforms will not only require healthcare system investments in tools described above but also the expertise of an integrated team comprising clinical informaticists (miners) and architects (CVD population health management experts and administrative leadership) to generate and apply high-quality real-world data to identify and address care gaps.15,24


In step 2, the goal is to leverage big data infrastructure addressing unmet actionable information needs of CVD population health management stakeholders. Accomplishing this goal requires developing the right interface, novel analytic methods and data visualization tools, and workflow approaches to convert discrete data sources into high-value insights and inform efforts to maximize system efficiency and create sustainable population health solutions.25

The goal is to develop products and tools to generate and communicate real-world information to support policy-making, positively influence patient and clinician behaviors, and improve CV health. To respond to these imperatives, it is key to consider tools that can effectively use science and informatics to support optimal care delivery at the point of care, generate actionable information for optimal health system performance and population management, and develop innovative approaches to support these tools with big data analytics.26,27 Some of the potential digital tools are discussed below.

Patient Care Gaps Dashboards

Interactive population dashboards with real-time updates from big data platforms and customizable features are critical to providing timely insights on needs/gaps that must be addressed to enhance population health at the organization/practice/provider level (command center), with the ultimate goal of ensuring that the majority of the population receive recommended care for optimized risk.28-30 These insights can alert operational and clinical leaders, frontline providers, and care teams to population-level outcomes, particularly for disproportionately affected subgroups such as those with missed preventive care (eg, blood pressure [BP] screening in pregnancy), out-of-range risk factors (such as low-density lipoprotein cholesterol [LDL-C] and BP levels), or those experiencing care gaps (such as patients with atherosclerotic CVD [ASCVD] who are not on high-intensity statins or aspirin).30,31 These interactive dashboards can be enhanced with traffic-light notifications tracking essential elements to highlight key domains (eg, CVD risk factors, past medical history, medication, labs, screening recommendations).30,31 The dashboard can also create alerts for cases where risk cannot be calculated, such as with missing data on sociodemographic or clinical risk factors.
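As a minimal sketch of the traffic-light idea described above, the following Python snippet assigns a status to each ASCVD patient from two inputs. The field names, the four-color scheme, and the rule for "yellow" are assumptions made for illustration; only the LDL-C target of < 70 mg/dL and the idea of flagging missing data come from this review. It is not a validated clinical rule.

```python
from typing import Optional

# Assumed threshold: the LDL-C target of < 70 mg/dL cited in this review for
# patients with established CVD. Field names and color rules are hypothetical.
LDL_TARGET_MG_DL = 70.0


def traffic_light(ldl_c: Optional[float],
                  on_high_intensity_statin: Optional[bool]) -> str:
    """Return 'grey', 'red', 'yellow', or 'green' for one ASCVD patient."""
    # Grey: status cannot be assessed because a required field is missing.
    if ldl_c is None or on_high_intensity_statin is None:
        return "grey"
    at_target = ldl_c < LDL_TARGET_MG_DL
    # Green: at target and on high-intensity statin therapy.
    if at_target and on_high_intensity_statin:
        return "green"
    # Yellow: exactly one of the two criteria is met.
    if at_target or on_high_intensity_statin:
        return "yellow"
    # Red: neither criterion is met.
    return "red"


# Hypothetical patient rows, as a dashboard back end might receive them.
patients = [
    {"id": "A", "ldl_c": 62.0, "statin": True},
    {"id": "B", "ldl_c": 95.0, "statin": True},
    {"id": "C", "ldl_c": 130.0, "statin": False},
    {"id": "D", "ldl_c": None, "statin": True},
]
statuses = {p["id"]: traffic_light(p["ldl_c"], p["statin"]) for p in patients}
print(statuses)
```

Keeping "grey" separate from "red" reflects the point made above: a patient whose risk cannot be calculated is a data problem to resolve, not necessarily a care gap.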

With real-time updates from big data platforms, care teams can use reports generated from these dashboards for patient outreach, specifically to identify patients who need online versus one-on-one education and goal-setting meetings to efficiently manage their chronic conditions.31-33 These tools can also create working lists of high-risk, potentially high-cost patients (such as those with multiple comorbidities) and help prioritize interventions that may require more vigorous care coordination.28 From an operational perspective, these reports will generate benchmarks that can be longitudinally tracked and shared with clinical teams along with specific metrics tied to pay-for-performance incentives for closing care gaps.
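The working lists mentioned above could be produced by a simple ranking step. The sketch below is one possible approach under stated assumptions: the record layout (`comorbidities`, `open_gaps`) and the ranking key (comorbidity count first, then number of open care gaps) are hypothetical choices, not a method described in the cited sources.

```python
def build_working_list(patients, top_n=3):
    """Rank patients by comorbidity count, then by number of open care gaps.

    Returns the ids of the top_n patients. The ranking key is an assumption
    chosen for illustration only.
    """
    ranked = sorted(
        patients,
        key=lambda p: (len(p["comorbidities"]), len(p["open_gaps"])),
        reverse=True,
    )
    return [p["id"] for p in ranked[:top_n]]


# Hypothetical registry rows.
registry_rows = [
    {"id": "P1", "comorbidities": ["diabetes", "CKD"],
     "open_gaps": ["no statin"]},
    {"id": "P2", "comorbidities": ["hypertension"],
     "open_gaps": ["LDL-C above target", "no statin"]},
    {"id": "P3", "comorbidities": ["diabetes", "CKD", "heart failure"],
     "open_gaps": ["BP above target"]},
    {"id": "P4", "comorbidities": [],
     "open_gaps": ["BP above target"]},
]
working_list = build_working_list(registry_rows, top_n=3)
print(working_list)
```

In practice a care team would likely replace the comorbidity count with a validated risk or cost score; the sorting pattern stays the same.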

Clinical Decision Support Systems

In recent years, there has been a growing interest in exploring the potential role of clinical decision support systems (CDSS) in clinical workflows to bridge the gap between practice and best evidence.7 In this regard, tailored CDSS will be specifically pursued to bridge such gaps at the point of care, reduce disparities, and inform decisions that are transparent and shared between physicians and patients. In turn, this can help overcome challenges posed by limited time and the fact that information required for any given decision is moving beyond unassisted human capacity.

The major focus with CDSS can be broadly categorized into the following groups: providing (a) comprehensive patient overview (needs/gaps), (b) predictive risk estimates at the point of care, and (c) point-of-order feedback and guidance on the appropriateness of diagnostic testing, treatments, and other clinical care decisions. For example, among established therapeutic options, statins remain significantly underutilized in patients at high risk for CVD.8 Similar gaps in evidence-based use of non-statin lipid-lowering, anti-platelet, and heart-failure therapies have been previously reported.34,35

With massive growth in digital registries and big data analytics, pressing questions surrounding quality of suboptimal cardiovascular care (such as lipid-lowering management) can be answered using EMR data from large health systems.36 CDSS tools can digitalize specific population health management goals by providing real-time insights on guideline-concordant choices of drug therapy in relation to comorbid diseases based on eligibility criteria, risk stratification, LDL-C targets, and health system leadership preferences for achieving "right care to the right patient at the right time" strategies.37
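To make the notion of rule-based, point-of-care guidance concrete, here is a deliberately simplified Python sketch. The `LipidSnapshot` structure, the message texts, and the rule order are all hypothetical; the only values taken from this review are the LDL-C target of < 70 mg/dL and the focus on statin intensity. It is an illustration of the pattern, not clinical guidance or any vendor's CDSS logic.

```python
from dataclasses import dataclass
from typing import Optional

LDL_TARGET_MG_DL = 70.0  # target cited in this review for established CVD


@dataclass
class LipidSnapshot:
    """Hypothetical point-of-care summary for one patient."""
    has_ascvd: bool
    ldl_c: Optional[float]   # mg/dL; None if no recent result is available
    statin_intensity: str    # "none", "moderate", or "high"


def suggest(snapshot: LipidSnapshot) -> Optional[str]:
    """Return an advisory message, or None when no alert is warranted."""
    if not snapshot.has_ascvd:
        return None  # this sketch only covers patients with ASCVD
    if snapshot.ldl_c is None:
        return "No recent LDL-C result: consider ordering a lipid panel."
    if snapshot.ldl_c < LDL_TARGET_MG_DL:
        return None  # at target, so stay silent to limit alert fatigue
    if snapshot.statin_intensity != "high":
        return "LDL-C above target: review statin intensity."
    return "LDL-C above target on high-intensity statin: review non-statin options."


print(suggest(LipidSnapshot(has_ascvd=True, ldl_c=96.0, statin_intensity="moderate")))
```

Returning `None` for patients at target is a design choice that keeps the tool quiet unless action may be needed, which matters given the alert-related limitations of CDSS discussed later in this review.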

CDSS have been developed, implemented, and evaluated in diverse populations globally. For example, a unique CDSS, EB (evidence-based) Guidelines, was successfully implemented at the Johns Hopkins Health System, giving providers the unique ability to personalize guidelines and treatment plans by allowing interoperability between the CDSS platform (Agile MD) and Epic EMR.38

Similarly, Scripps Green Hospital in California developed and implemented a CDSS to improve heart team efficiency for transcatheter aortic valve implantation using an algorithm and rule-based alert system to integrate data and inform treatment decisions.39 Similar tools have been implemented globally, including computerized decision support systems to manage CVD risk in an EMR setting in the Netherlands40 and CDSS for chronic heart failure management in the UK.41

This domain offers immense possibilities for maximizing aid to front-end providers to ensure optimal care at both population and individual levels. Similar to patient-care gap dashboards, CDSS can be enhanced with traffic-light dashboards that track essential elements (eg, CVD risk factors, past medical history, medication, labs, screening recommendations).42 These front-end tools can empower healthcare providers to discuss disease risk scores with patients based on insights from similar patient cases, more accurately calculate risk, present management options, and explain changes in risk trajectories with specific pathways using "what-if" scenarios.30,32 Prior to widespread adoption, iterative CDSS designs should be extensively tested in pilot conditions and undergo multiple rounds of redesign in limited settings.

Direct Patient Engagement Applications

Direct patient engagement applications will enhance data generated from healthcare delivery systems, allowing a more comprehensive view of patient health. Ideally, the information will provide further insights into factors associated with health and offer opportunities for its improvement. Digital patient engagement solutions such as Hugo PHR (personal health record) are ideal for patient engagement and collecting information on patient-reported outcomes. Hugo is a unique cloud-based, EMR-linked mobile platform that informs and improves care delivery by integrating clinical systems with ongoing medical research. The platform allows researchers to interact with study participants and share findings.43

Some digital patient engagement platforms, such as those developed by the New York-Presbyterian Hospital (Columbia University Medical Center), use mobile devices such as tablet computers to complement the value of local web-based PHR portals and enhance patient participation in the care delivery process; the application uses Microsoft HealthVault to store both clinical and sociodemographic data.44

Another novel big data platform is Taltioni, a national data sharing platform in Finland with smartphone and mobile device applications that allow users to customize their health plans and access tools for blood pressure control, weight management and fitness, and medication schedule.45 These innovative data-sharing and patient-engagement platforms are increasingly being recognized as fundamental components of healthcare delivery.

Second, patient engagement via digital tools can be leveraged to supplement biometric information from wearable devices. Thus, wearable devices or health applications on a smartphone are another promising area for growth.10 Third, such digital solutions can provide a foundation to develop mobile applications for patient engagement.46 Automated alerts can add efficiency via prespecified algorithms and data for creating reports. They can also be linked with outbound message systems to patients covering regular health assessment surveys, appointment and test reminders, and follow-up communication.47 These steps will ensure that care gaps are addressed, necessary management practices are followed, and no high-risk patient falls through the cracks.

We anticipate that these automated products will significantly limit care gaps, increase adoption of evidence-based therapies, and promote compliance and self-management.48,49 There are strong business cases to be made for these strategies across a range of reimbursement models, such as increased revenue among fee-for-service (FFS) beneficiaries, lower cost of care for patients in risk contracts, and maximized pay-for-performance (P4P) reimbursements with better quality scores.

Key Performance Indicator Tools

Population health management teams within healthcare organizations face mounting pressure to improve cost of care.11 Big-data-driven insights, which have been widely adopted to minimize variability in production processes and to transform quality and efficiency in almost every other industry, also have the potential to transform healthcare efficiency.12

A prominent big data application for CVD population health management lies in the ability to inform administrative staff to monitor performance and identify patients who are not being appropriately managed according to recommended guidelines.13 We specifically propose creating sets of graphs for screening gaps, risk profiles, prescribing gaps for high-risk populations, and meeting guideline targets for blood pressure and lipids.33 For example, the Agency for Healthcare Research and Quality, as part of the Medicaid Innovation Accelerator Program, released data visualization best practices for primary care quality improvement to graphically present healthcare quality data for performance dashboards, with the ultimate aim of optimizing health information systems for data extraction.33 The report provides detailed information on selecting the most appropriate graphical illustration for a given comparison (eg, line graphs to depict trends/patterns for time-series relationships, and histograms/frequency polygons to show distribution per division of an interval scale).

Clinical and operational leaders can use the resulting information to assess trends and underlying factors (at both the patient and care-process levels), thereby impacting current best practices. Customized alerts can then be set for specific times, providing information on patients receiving suboptimal care and allowing administrative staff to set up recall and reminder alerts notifying the patient to attend a management consultation in person or via telehealth.39,49

These advanced analytic tools can provide insights on variation in care and cost, opportunity identification, and performance tracking.28,31 For example, peer-ranked performance dashboards can be developed for any periods of time that populate de-identified aggregated data for a specified number of performance indicators (eg, current guideline-directed medical treatment rates, rates of target achievement such as LDL-C < 70 mg/dL among CVD patients) to be shared with operational leaders in the entire chain of command.50,51 For each indicator, the performance for each individual and composite indicator (eg, lipid management bundle) can be displayed and benchmarked among peers and entities for a specified reporting period.
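A peer-ranked indicator of the kind described above can be computed with a few lines of aggregation. The following Python sketch assumes a hypothetical input of `(provider_id, latest_ldl_c)` pairs and uses rank-based labels such as "Provider 1" as a stand-in for de-identification; a real dashboard would need a proper de-identification and attribution policy.

```python
from collections import defaultdict


def target_rates(records, ldl_target=70.0):
    """Share of each provider's CVD patients with LDL-C below the target.

    records: iterable of (provider_id, latest_ldl_c_mg_dl) pairs (hypothetical).
    """
    counts = defaultdict(lambda: [0, 0])  # provider -> [at_target, total]
    for provider, ldl_c in records:
        counts[provider][1] += 1
        if ldl_c < ldl_target:
            counts[provider][0] += 1
    return {p: hit / total for p, (hit, total) in counts.items()}


def peer_ranking(rates):
    """Sort providers by rate (best first) and replace ids with rank labels."""
    ordered = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
    return [(f"Provider {i}", round(rate, 2))
            for i, (_, rate) in enumerate(ordered, start=1)]


# Hypothetical reporting-period data.
records = [
    ("dr_a", 65.0), ("dr_a", 80.0),
    ("dr_b", 60.0), ("dr_b", 55.0), ("dr_b", 90.0),
    ("dr_c", 100.0),
]
ranking = peer_ranking(target_rates(records))
print(ranking)
```

The same two functions could be reused for any binary indicator (for example, guideline-directed medical treatment rates) by changing how "at target" is defined.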

From an operational perspective, these reports will allow performance benchmarks that can be longitudinally tracked and shared with clinical teams as well as specific metrics tied with P4P incentives for closing care gaps (allowing filtering by payer, value-based patients, activity center, provider). In essence, these dashboards can overcome the care gaps and optimize population health by serving as the carrot and stick to limit avoidable variation that is reflected in a particular established CV care process.


A critical component to realize is a dedicated set-up catering to these end-to-end learning health systems for achieving desired population health goals. At Houston Methodist (HM), we are uniquely positioned to effectively contribute to regional and national efforts for digital transformation of health care, with the ultimate goal of improving quality of care, patient experiences, and outcomes. Using the big data expertise afforded by the HMH Center for Cardiovascular Computational and Precision Health (C3-PH), we aim to leverage our existing informatics infrastructure to develop a CVD Big Data for Personalized Medicine informatics program, allowing systematic data harmonization for rapid knowledge generation and creation of a robust clinical cardiology enterprise (Figure 3). The main focus for this program is data-driven advances in cardiovascular precision health care, discovery, and population health by harnessing the power of big data, innovative computation paradigms, and artificial machine intelligence.

Figure 3. Center for Cardiovascular Computational & Precision Health (C3-PH) operational architecture. ICD: International Statistical Classification of Diseases and Related Health Problems; CPT: Current Procedural Terminology; CV: cardiovascular; COR: Center for Outcomes Research

The governance of the program will include clinical, operational, and informatics leadership. Initial initiatives will include curation and maintenance of system-wide EMR-based registries supporting research initiatives of established clinical divisions. Furthermore, the program will support ongoing and future academic endeavors with needed tools and methods to utilize big data assets with advanced computation and machine intelligence paradigms. These assets and teams will, in turn, provide fertile ground to train the next generation of CV physician-scientists in big data and digital health care applications via intense coursework and in-depth practical projects. We hope that these capacity-building initiatives will also advance personalized digital interventions in partnership with health system stakeholders and industry partners.

The C3-PH program will also seek a collaborative partnership with a working group of cardiology and population health management leaders to provide constant feedback and guidance throughout the product development lifecycle. For example, weekly or biweekly schedules will be established for clinical end-users to view and critique the product and interject clinical process knowledge to enable the development of teams, make small adjustments, and clarify development processes of the digital tools. During this process, the executive leadership team will be regularly updated to inform high-level direction and provision of resources to overcome operational barriers. In response to the informational imperative, data collected in a big data platform will be used to tailor insights into optimal care decisions and delivery. At the core of these efforts are the tools and solutions created to complement and strengthen existing strategies for our institution to achieve its health goals and be a national and international leader in employing data science for population health management.

Real-World Utility of Big Data to Promote CV Population Health

It is well established that contemporary patterns of evidence-based practices for ASCVD management remain particularly suboptimal and variable at the national, organizational, practice, and provider levels. For example, it is known that target LDL-C levels (< 70 mg/dL or preferably < 55 mg/dL in patients with established ASCVD, especially very-high-risk patients) can be attained with maximum tolerated statin monotherapy; however, only 6 out of every 10 US adults with ASCVD report any statin use, whereas only 28% report high-intensity statin use (Class 1A recommendation).14 As noted above, similar gaps in evidence-based utilization of advanced lipid therapies have been repeatedly demonstrated in the literature.34,35

Big data approaches can translate data science into application by developing tools that enable easier visualization and communication of information that supports clinical decision making and positively influences health. A systematic digital interventions approach can close gaps in care for optimal LDL-C lowering for patients with established CVD (Figure 4).

Figure 4. Big data approach for addressing LDL care gaps. ASCVD: atherosclerotic cardiovascular disease; PCSK9i: proprotein convertase subtilisin/kexin type 9 inhibitor; LDL: low-density lipoprotein

  • Step 1: Leverage EMR data to create dynamic data registries and dashboards that can provide real-time insights on contemporary practices and care gaps in ASCVD patients, including identifying those with high-risk features and illuminating concurrent use and intensity of lipid-lowering therapies to achieve target LDL-C levels and gaps in lipid-lowering therapies. Patients with ASCVD can be identified using International Classification of Diseases, 9th and 10th Revision, Clinical Modification (ICD-9-CM/ICD-10-CM) diagnosis and procedure codes. Selected variables would be extracted from the common data model mapping of the EMR for the study population. Relevant covariates for inclusion in the proposed registries/dashboards are listed in Table 1. These registries will leverage standard terminologies and coding systems to enable interoperability with and responsiveness to evolving data standards with development of a web-based interface.
  • Step 2: Analyze registry data to elucidate physician, patient, and systemwide determinants of suboptimal LDL-C goals and appropriate use of guideline-directed LDL-C lowering therapies among ASCVD patients. From there, model the proportion of patients who would achieve guideline-directed LDL-C levels with additional non-statin therapies. These insights will provide critical feedback and inform development of digital solutions and healthcare processes to overcome these barriers.
  • Step 3: Develop, test, and implement a clinical decision support program for outreach to patients not on optimal care, and adopt the American College of Cardiology/American Heart Association cholesterol management guidelines algorithm for escalating lipid management in appropriate high-risk patients not at LDL-C goals. The alerts can include giving providers a monthly list of patients falling through care gaps, direct patient messaging, alerts at point of care with recommendations for best practices, and suggestions for referrals to specialized services.
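The three steps above can be sketched end to end in a few short functions. This is a minimal Python illustration under loud assumptions: the ICD-10 category prefixes below (I21, I25, I63, I70) are only a small illustrative subset and not a validated ASCVD phenotype, the record layouts are hypothetical, and the gap definition uses only the < 70 mg/dL LDL-C target mentioned in this review.

```python
from collections import defaultdict

# Illustrative ICD-10 category prefixes only (acute MI, chronic ischemic heart
# disease, cerebral infarction, atherosclerosis). NOT a validated code list.
ASCVD_PREFIXES = ("I21", "I25", "I63", "I70")
LDL_TARGET_MG_DL = 70.0


def build_registry(encounters):
    """Step 1: patients with at least one encounter coded to an ASCVD prefix."""
    return {e["patient_id"] for e in encounters
            if e["icd10"].upper().startswith(ASCVD_PREFIXES)}


def find_care_gaps(registry, latest_labs):
    """Step 2: registry patients whose latest LDL-C is at or above target."""
    return sorted(pid for pid in registry
                  if pid in latest_labs
                  and latest_labs[pid]["ldl_c"] >= LDL_TARGET_MG_DL)


def provider_worklists(gap_ids, latest_labs):
    """Step 3: group gap patients by provider for a periodic outreach list."""
    lists = defaultdict(list)
    for pid in gap_ids:
        lists[latest_labs[pid]["provider"]].append(pid)
    return dict(lists)


# Hypothetical extracts from an EMR common data model.
encounters = [
    {"patient_id": "p1", "icd10": "I25.10"},
    {"patient_id": "p2", "icd10": "E11.9"},
    {"patient_id": "p3", "icd10": "I21.4"},
    {"patient_id": "p4", "icd10": "I63.9"},
]
latest_labs = {
    "p1": {"ldl_c": 110.0, "provider": "dr_a"},
    "p2": {"ldl_c": 150.0, "provider": "dr_b"},
    "p3": {"ldl_c": 60.0, "provider": "dr_a"},
    "p4": {"ldl_c": 85.0, "provider": "dr_b"},
}

registry = build_registry(encounters)
gaps = find_care_gaps(registry, latest_labs)
gap_rate = len(gaps) / len(registry)
worklists = provider_worklists(gaps, latest_labs)
print(sorted(registry), gaps, round(gap_rate, 2), worklists)
```

Patient `p2` has a high LDL-C but no ASCVD code, so it never enters the registry; this shows why the cohort definition in Step 1 drives every downstream metric.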


The power and promise of big data are not without limitations. First and foremost is the issue of safe storage of vast amounts of data, which can be burdensome regardless of the storage option chosen (eg, on site, cloud based, hybrid).17

Second, the fourth V (veracity) is often a concern with large datasets, including EMR, claims, and other big data sources.20 Missing data, coding errors, under/over-representation of certain medical conditions, and other concerns may limit reliability of big data findings.17,52,53 Such concerns must be acknowledged and addressed, to the extent possible, using appropriate methodological tools. Relatedly, cleaning, management, and interpretation of big data require considerable statistical expertise, which may not be readily available, thereby limiting use and application.
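A first, simple check on veracity is to quantify missingness per field before any analysis. The Python sketch below assumes a hypothetical list-of-dictionaries extract in which absent values are stored as `None` or omitted; it reports the fraction missing for each requested field and does not address coding errors or representation bias.

```python
def missingness(records, fields):
    """Fraction of records in which each field is missing (None or absent)."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to profile")
    return {f: sum(1 for r in records if r.get(f) is None) / n for f in fields}


# Hypothetical EMR extract with gaps in laboratory and vital-sign fields.
extract = [
    {"age": 64, "ldl_c": 88.0, "sbp": 132},
    {"age": 71, "ldl_c": None, "sbp": 140},
    {"age": 58, "ldl_c": 102.0, "sbp": None},
    {"age": 66, "ldl_c": 75.0},  # sbp key absent entirely
]
report = missingness(extract, ["age", "ldl_c", "sbp"])
print(report)
```

A report like this can feed the dashboard alerts for patients whose risk cannot be calculated, and it helps decide whether methods such as imputation or sensitivity analysis are needed.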

Third, limitations inherent to observational data (including selection and information bias, confounding, effect modification, and study design limitations such as lack of causality, when applicable) present additional challenges and must be accounted for to ensure reliability, precision, and generalizability of estimates.53

Fourth, digital tools such as CDSS have various operational limitations that must be considered. For example, CDSS adoption and application may face significant resistance from healthcare providers, owing to lack of relevant knowledge/training, and possible disruption of clinician workflows.54 CDSS are prone to errors, inaccuracies, and inconsequential reporting/alerts that might require additional physician verification to determine appropriate response.55,56

Further, despite recent innovations, challenges with systems crosstalk remain a major CDSS limitation; interoperability among diverse data sources and populations is further reduced by local, regional, and national privacy laws that govern such communication.57,58 Cloud-based systems offer an exciting alternative to ensure generalizability and transportability; however, these systems must comply with applicable privacy and data security laws.57

Fifth, while mobile and wearable devices offer exciting avenues for data sharing, patient engagement, and quality improvement, such innovative solutions are limited to those who have access to digital devices, including Android/Apple smartphones, tablets, and wearable technology (eg, smartwatches). Privacy issues with wearable technology may pose additional barriers to efficient patient engagement and data sharing.

Despite these limitations, wide use and application of big data using novel tools offers exciting and often untapped avenues for evidence-based cardiovascular care management on a population level, including clinical risk reduction and quality improvement, development and implementation of targeted medical interventions, and improved overall patient satisfaction.


Health care is at a critical and exciting juncture, an inflection point, where big data applications and tools have tremendous potential to optimize point of care management, enhance cardiovascular healthcare quality and performance, and improve outcomes across large populations. Successful achievement of the goals and objectives discussed above will require a multilayered, multidisciplinary effort to ensure continued advancement of big data by investing in big data platforms, harnessing technology to create software applications focusing on optimizing population health management, developing digital solutions to inform clinical and policy decisions, and optimizing public- and patient-engagement strategies.


  • Existing practices for lowering downstream risk and cost of care are suboptimal for patients with cardiovascular disease (CVD), particularly those at high risk. Massive gaps in evidence-based care at the national, organizational, practice, and provider level can be attributed to variation in provider attitudes, lack of incentives for positive change and care standardization, and observed uncertainty in clinical decision making.
  • Guided by the learning healthcare system model, big data analytics and digital application tools and platforms offer unique opportunities for value-based healthcare delivery and efficient cardiovascular population management.
  • Big data frameworks for real-world CVD population health management must include development of (1) a big data platform; (2) digital tools, such as patient care/population health dashboards and clinical decision support systems (CDSS); (3) direct patient engagement applications, such as mobile and wearable applications/devices; (4) key performance indicator tools, including creating novel digital visualization techniques, and (5) a robust implementation framework that leverages the existing informatics infrastructure for systematic data harmonization and knowledge synthesis.
  • Successful implementation of big data solutions for cardiovascular population health management requires a multidisciplinary approach, including investment in big data platforms, harnessing technology to create novel digital applications, developing digital solutions that can inform the actions of clinical and policy decision makers and relevant stakeholders, and optimizing engagement strategies with the public and information-empowered patients.