Referral paths in the U.S. physician network

An, Chuankai; O’Malley, A. James; Rockmore, Daniel N.

doi:10.1007/s41109-018-0081-4

Research
Open access
Published: 31 July 2018

Applied Network Science volume 3, Article number: 20 (2018)

Abstract

In this paper, we analyze the millions of referral paths of patients’ interactions with the healthcare system for each year in the 2006-2011 time period and relate them to U.S. cardiovascular treatment records. A patient’s “referral path” records the chronological sequence of physicians encountered by that patient (subject to certain constraints on the times between encounters). It provides a basic unit of analysis in a broader referral network that encodes the flow of patients and information between physicians in a healthcare system. We consider referral networks defined over a range of interactions as well as the characteristics of referral paths, producing a characterization of the various networks as well as the physicians they comprise. We further relate these metrics and findings to outcomes in the specific area of cardiovascular care. In particular, we match a referral path to occurrences of Acute Myocardial Infarction (AMI) and use the summary measures of the referral path to predict the treatment a patient receives and medical outcomes following treatment. Some referral path features are more significant with respect to their ability to boost a tree-based predictive model, and have stronger correlations with numerical treatment outcome variables. The patterns of referral paths and the derived informative features illustrate the potential for using network science to optimize patient referrals in healthcare systems for improved treatment outcomes and more efficient utilization of medical resources.

Introduction

A well-designed healthcare system is a crucial element of a well-functioning society, and the ways in which information and resources flow in such a system are key determinants of its efficacy. Patient referrals serve as a useful and measurable proxy for communication and collaboration between physicians in different specialties (Burns and Muller 2008). Physicians refer a patient to other physicians who will either be within or outside of their own hospital, generally (although not exclusively) for considerations relevant to the care of patients. To that end, reasons range from the need for specialized care to addressing problems of overcrowding (and thus postponing care). In aggregate, the sequence of referrals – a referral path – for a given patient over the course of treatment for a given concern is thus an important record of a focused interaction of a patient with the healthcare system. It represents a collection of pairwise, and possibly group, information-sharing opportunities about treatment between two physicians, or among a team of physicians, involved in the treatment of a patient.

The language (and mathematics) of network science is well-adapted to the study of such discretized and localized information and resource flow. In the particular case of healthcare we use network models and measures as a way of understanding patient care, healthcare resource allocation and treatment efficiency. To that end the referral of a patient by physician A to physician B is naturally represented as a directed edge from one network node to another. A referral path also stores the date of the visit and interactions between a patient and each physician on the path. Possibly because of specialty, different physicians might spend uneven amounts of time and effort (e.g., as measured by the relative value unit or “RVU”^{Footnote 1}) during a typical encounter with a patient. We describe the referral path in terms of multiple features (e.g., time between initial and final encounters or average RVU). Domains of investigation can range from the network of physicians in or attributed to a hospital, the Hospital Referral Region (HRR), or the entire United States referral network. A range of choices for edge weights can articulate different properties of these interactions. Given groups of referral network structural measures and referral path features, multilevel regression models and classification methods in machine learning have the potential to reveal relationships between the organization of patient flow in the healthcare system and the well-being of patients, and with this, insights into improving efficacy and resource allocation for our healthcare system.

In earlier work (An et al. 2018) we presented an analysis of the U.S. patient referral network, subjecting it and its HRR and state-level subnetworks to a range of network analyses to uncover their large-scale network structure. That work built on earlier studies (Landon et al. 2012; Mandl et al. 2014; Lee et al. 2011; Lomi et al. 2014; Donker et al. 2010; Shea et al. 1999). Example results include the existence of power laws in degree distributions, “small-world” and core-periphery structures, and a statistical analysis of the motif structures in these networks. A suite of regressions also uncovered interesting relationships among the various network metrics. In this paper we study the more fine-scale patterns to be found in referral paths and, importantly, link these statistics to treatment outcomes in the particular setting of cardiovascular disease. While referral paths and referral information have generally been ignored as factors in the important problem of treatment outcome prediction, the predictive value of other kinds of data has been studied. Fiterau et al. (2017) applied deep neural networks to time series of sensory data to predict other diagnoses. Several works in medical research (Liaw 2009; Ellis et al. 2008; Ball et al. 2014) focused mainly on variables from clinical medical tests and used standard statistical analysis techniques to make inferences about the relationships of treatments to outcomes.

Prior studies related to referral paths have been limited in terms of the range of health records studied (Uddin et al. 2013; Uddin 2016). In this paper we introduce new metrics related to the study of referral paths and are able to compute detailed network measures in a much larger dataset (the TDI ^{Footnote 2} dataset) of cardiovascular disease treatment, ranging from a local hospital or HRR to the current national referral network. Aggregating the data from thousands of local hospitals and hundreds of HRRs, we use statistical methods to validate the general patterns of referral paths and referral networks. We characterize the dynamics of changes of node position and type among all physicians on a referral path. In the case of cardiovascular treatment, we find evidence of key roles on a referral path, especially for the physicians with a specialty of cardiovascular and internal medicine. We also validate the prevalence of patterns of referrals indicating that physicians work with their professional acquaintances when choosing the target of a referral, i.e., regularly send patients to the physicians who have many common collaborators. We then apply classification models to the cardiovascular referral network measures and referral path features to predict teaching status of a hospital and a patient’s treatment outcome (e.g., indicator of death within 1 year after treatment). Our considerations of networks and referral paths for cardiovascular treatment could clearly be adapted for other contexts. More specifically, given patient referral records tied to a different disease state, the metrics and methodologies we introduce here (e.g., the feature and pattern mining, model selection, analysis, etc.) could be directly adapted. In addition, our study has implications for research about a generalized notion of a referral path in such contexts as information flow in online media or social networks.

Some specific contributions of our work include:

Novel definition of the health records-based referral path as well as novel definition of salient features for referral paths generated from both network science and time series analysis.
Quantification of a physician’s position using centrality and other measures in the U.S. national cardiovascular referral network with the help of techniques specific to big data that are necessary for overcoming the infeasibility of using traditional algorithms for calculations at scale.
Investigation of the patterns of millions of referral paths in the referral network, which are validated by statistical tests.
Effective classification and regression models derived from novel referral path features and referral networks that distinguish (a) teaching status of a hospital and (b) patient treatment outcomes. These models pick up key predictors among network measures relevant to the optimization of an effective healthcare system.

Materials, notation, and methodology

Materials

We used Medicare beneficiary claims data for all patients diagnosed with cardiovascular disease in the U.S. during 2006-2011 to build referral paths and networks of the U.S. healthcare system. Here cardiovascular disease means that the diagnostic codes of the patient’s Medicare claims indicate arrhythmia, congestive heart failure, coronary heart disease or peripheral vascular disease. This dataset is of interest for several reasons. It is a kind of network “big data” (as we will see, the data produce networks with hundreds of thousands of nodes and millions of edges) in a research area (healthcare) where data analysis has traditionally not been accomplished at this scale (i.e., related work considers data at the level of the health care unit, e.g., a hospital, or a local region). In our previous work (An et al. 2018) related to national networks we had much less metadata, so that work was more descriptive. The richer data used here enable us to begin to create more interesting methodologies for this kind of data. In particular, by focusing on the part of the national dataset related to a disease diagnosis, we can begin to articulate and build out methodologies that relate to outcomes. With the exception of patients dually eligible for Medicare and Medicaid, these data contain a record of each physician encounter of each Medicare patient. Each such record contains the patient or “beneficiary” (Bene) identification (ID) number, physician National Provider Identification (NPI) number, visit date, RVU associated with the visit and other details^{Footnote 3}. Since the NPI numbers for all physicians changed in 2007, some of our analyses apply only to the interval 2007-2011. Although claims data and other sources of patient-physician encounters have previously been used to form physician networks (An et al. 2018; Landon et al. 2012; Mandl et al. 2014; Lomi et al. 2014; Shea et al. 1999), in this paper we apply a more nuanced approach.

At the heart of this is the notion of a “referral from physician A to physician B”, which we define as the event that a patient encounters physician B within 30 days of encountering physician A (and encounters no other physician in between those times). The “referral path” is a maximal sequence of referrals, assumed to embody the team of physicians involved in the treatment of a patient over the course of a given episode of illness. A referral path might connect physicians in different areas. Since each visit record includes the HRR and hospital where a physician works, or to which the physician is attributed on the basis of where most of their patients are hospitalized (Bynum et al. 2007), we can categorize referral paths as purely intra- versus inter-hospital or intra- versus inter-HRR. Similarly, the various network measures can be evaluated for each HRR or hospital-level (PHN) subnetwork. In this paper we will be primarily interested in cardiovascular referral networks, since the raw records of patient-physician visits are specific to cardiovascular disease treatment in the U.S.
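As a small illustration of the intra- versus inter-unit categorization, the following sketch labels a path by whether all of its physicians are attributed to a single unit (hospital or HRR). The function name `path_scope` and the mapping `unit_of` are hypothetical and are not taken from the dataset; how ties or missing attributions are handled in practice is not specified here.

```python
def path_scope(path, unit_of):
    """Label a referral path as 'intra' or 'inter' with respect to a unit.

    path    : list of physician ids in chronological order
    unit_of : dict mapping physician id -> hospital (or HRR) id
              (hypothetical lookup; assumes every physician is attributed)
    """
    units = {unit_of[physician] for physician in path}
    return "intra" if len(units) == 1 else "inter"


# Hypothetical attribution of three physicians to two hospitals.
unit_of = {"A": "H1", "B": "H1", "C": "H2"}
print(path_scope(["A", "B", "A"], unit_of))  # all visits within H1
print(path_scope(["A", "B", "C"], unit_of))  # path crosses H1 and H2
```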

Referral path

The relationship between patients and physicians is naturally represented as a bipartite graph. In Fig. 1, several edges connect two patients (α and β) to some physicians whom they have visited. Patient α visits four physicians in the sequence (A,B,C,D) and patient β visits B and C. By sorting the four physicians according to the date of patient α’s visit, we recover a sequence of four physicians reflecting the sequence of encounters. In this paper, we define a patient referral path as a sequence of physicians whom the patient encounters in chronological order. If a patient encounters a physician followed by another within a threshold of 30 days (i.e., a referral exists), we assume there is an information exchange opportunity between the two physicians.
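The construction of referral paths from raw encounter records can be sketched as follows. This is a minimal illustration rather than the processing code used for the study: the record layout `(patient_id, physician_id, visit_date)`, the function name, and the choice to treat a gap of exactly 30 days as within the threshold are all assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta

MAX_GAP = timedelta(days=30)  # referral threshold; "<= 30 days" is an assumption


def build_referral_paths(encounters):
    """Split each patient's encounters into referral paths.

    encounters : iterable of (patient_id, physician_id, visit_date) tuples
    Returns {patient_id: [path, ...]}, where each path is a list of
    physician ids in chronological order and consecutive visits on a path
    are at most MAX_GAP apart.
    """
    by_patient = defaultdict(list)
    for patient, physician, day in encounters:
        by_patient[patient].append((day, physician))

    paths = {}
    for patient, visits in by_patient.items():
        visits.sort()  # chronological; same-day ties ordered by physician id
        patient_paths = []
        current = [visits[0][1]]
        for (prev_day, _), (day, physician) in zip(visits, visits[1:]):
            if day - prev_day <= MAX_GAP:
                current.append(physician)      # referral: extend the path
            else:
                patient_paths.append(current)  # gap too long: close the path
                current = [physician]
        patient_paths.append(current)
        paths[patient] = patient_paths
    return paths


# Toy records mirroring the two patients of Fig. 1 (dates are invented).
records = [
    ("alpha", "A", date(2010, 1, 1)),
    ("alpha", "B", date(2010, 1, 10)),
    ("alpha", "C", date(2010, 1, 20)),
    ("alpha", "D", date(2010, 2, 5)),
    ("beta", "B", date(2010, 3, 1)),
    ("beta", "C", date(2010, 3, 15)),
]
print(build_referral_paths(records))
```

A physician may appear more than once on a path; the sketch keeps such repeat encounters, since the ranking based weight defined later counts multiple encounters with the same physician.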

Referral network and computation of edge weights

The referral network (over a given time period) is a directed network whose node set consists of the physicians present in the database over that period. If physician A refers at least one patient to physician B, this is represented by a directed edge from A to B. Given all referrals over a year, we are able to build the national patient referral network of U.S. physicians. In this paper, we mainly investigate micro-patterns of referral paths for each patient in HRR/PHN referral networks, while our prior work (An et al. 2018) introduced macro-patterns derived from directed national, HRR, and state referral sub-networks. Herein, most of the network measures are also derived from directed referral networks, except for a few measures computed on the corresponding undirected networks, such as diameter, clustering coefficient and giant component.
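A directed edge list can be derived from referral paths by counting consecutive physician pairs. The sketch below is illustrative only; the function name is hypothetical, and skipping consecutive visits to the same physician (so that no self-loops are created) is an assumption not stated in the text.

```python
from collections import Counter


def referral_edge_counts(paths):
    """Count referrals for each ordered pair of physicians.

    paths : iterable of referral paths (lists of physician ids)
    Returns a Counter mapping (A, B) -> number of referrals from A to B.
    A directed edge A -> B exists whenever the count is at least 1.
    """
    counts = Counter()
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a != b:  # assumption: repeat visits to one physician add no edge
                counts[(a, b)] += 1
    return counts


# The two paths of Fig. 1: patient alpha visits A, B, C, D; patient beta visits B, C.
edges = referral_edge_counts([["A", "B", "C", "D"], ["B", "C"]])
print(edges)
```

The resulting counts can serve directly as edge weights, or be thresholded at 1 to obtain the unweighted network.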

Edges can be weighted in a variety of ways. A simple unweighted edge (i.e., edge weight equal to 1) denotes simply a connection. More information is added if we use other natural metrics such as the number of referrals or the geometric mean of RVUs. A novel metric that we define here is the “ranking based weight”: Let the vector r=(1,2,…,n) denote the chronological “ranks”^{Footnote 4} of the encounters on a referral path consisting of n physician encounters. For a given physician A, let n_A denote the number of encounters with physician A on the referral path, and let r_A be the sub-list of the ranks of the encounters with physician A on the referral path (so, if A was encountered on the first and last visits only, then r_A=(1,n)). In this way, n_A is the length of r_A. The flow of patients from physician A to physician B is then given by

$$ f_{AB} = \frac{\sum_{i<j} I\left(r_{Ai}<r_{Bj}\right)}{n_{A}n_{B}} \tag{1} $$

and from B to A by

$$ f_{BA} = \frac{\sum_{i<j} I\left(r_{Ai}>r_{Bj}\right)}{n_{A}n_{B}}. \tag{2} $$

To compute the ranking based weight of an edge, we compute a weighted sum of the patient flow of Eq. (1) over every referral path p containing both physicians A and B. A referral path p might include multiple physicians, but the flow of patients between physicians A and B on that path depends only on their sub-vectors r_A and r_B, without any impact from a third physician. The flow in Eq. (1) lies in [0, 1], and under the assumption of a stationary model of the occurrence of doctor’s visits it will converge to a constant as n_A and n_B go to infinity. To account for the length of each referral path, we write n_Ap and n_Bp for the numbers of encounters with A and B on path p, and f_ABp for the corresponding flow, and weight the contribution of each referral path by the geometric mean of n_Ap and n_Bp in Eq. (3).

$$ w_{AB}=\sum\limits_{p} \left(n_{Ap}n_{Bp}\right)^{1/2} f_{ABp} \tag{3} $$
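The ranking based weight can be illustrated with a short sketch. It rests on one interpretive assumption that should be checked against the authors' implementation: the sum in the flow is taken over all pairs consisting of one encounter with A and one encounter with B, which guarantees a value in [0, 1] and makes the flows in the two directions sum to 1. The function names are hypothetical.

```python
from math import sqrt


def ranks_of(path, physician):
    """1-based chronological ranks of a physician's encounters on a path."""
    return [rank for rank, p in enumerate(path, start=1) if p == physician]


def flow(r_a, r_b):
    """Fraction of (A-encounter, B-encounter) pairs in which A comes first.

    Assumption: the sum ranges over all n_A * n_B pairs of encounters.
    """
    a_first = sum(1 for ra in r_a for rb in r_b if ra < rb)
    return a_first / (len(r_a) * len(r_b))


def ranking_weight(paths, a, b):
    """w_AB: sum over paths containing both A and B of sqrt(n_Ap * n_Bp) * f_ABp."""
    total = 0.0
    for path in paths:
        r_a, r_b = ranks_of(path, a), ranks_of(path, b)
        if r_a and r_b:  # only paths containing both physicians contribute
            total += sqrt(len(r_a) * len(r_b)) * flow(r_a, r_b)
    return total


# On the path A, B, A, C, B: r_A = (1, 3) and r_B = (2, 5).
# Three of the four (A, B) encounter pairs have A first, so f_AB = 3/4.
paths = [["A", "B", "A", "C", "B"], ["A", "B"]]
print(flow([1, 3], [2, 5]))            # 0.75
print(ranking_weight(paths, "A", "B"))  # 2 * 0.75 + 1 * 1.0 = 2.5
print(ranking_weight(paths, "B", "A"))  # 2 * 0.25 + 1 * 0.0 = 0.5
```

In the example, physician C on the first path has no effect on the weight between A and B, in line with the remark that a third physician does not influence the flow.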

To sum up, Table 1 shows an interim step of the data processing pipeline, with the format of the input data and the resulting referral paths and networks.

Table 1 Example pipeline of data processing from raw patient-physician encounter records to referral paths and edges of referral network
