library(npi)
library(purrr)
This vignette explores advanced uses of the npi package.
npi is an R package that gives R users access to the U.S. National
Provider Identifier (NPI) Registry API, maintained by the Centers for
Medicare and Medicaid Services (CMS). The package makes it easy to
obtain administrative data linked to a specific individual or
organizational healthcare provider. Users can also perform advanced
searches based on provider name, location, type of service,
credentials, and many other attributes.
See the npi::npi vignette for an introduction to the package.
CMS regularly releases full NPI data files here. The API, and therefore
npi_search(), returns a maximum of 1,200 records. We recommend
downloading the data file if you need to work with the entire dataset
or with more records than that maximum. Data dissemination files are
zipped and will exceed 4 GB upon decompression.
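As a rough illustration of the record cap described above, the sketch below defines a small check for whether a result may have been cut off. The helper name may_be_truncated() is hypothetical and not part of npi; the commented usage assumes that npi_search() accepts a limit argument controlling how many records are requested.

```r
# Maximum number of records the API returns, as stated in the text above.
max_records <- 1200L

# Hypothetical helper (not part of npi): TRUE when a result has at least as
# many rows as were requested, which suggests the query may be truncated.
may_be_truncated <- function(results, limit = max_records) {
  nrow(results) >= limit
}

# Usage (requires network access; `limit` is assumed to be the npi_search()
# argument that sets the number of records requested):
# atlanta <- npi::npi_search(city = "Atlanta", limit = max_records)
# may_be_truncated(atlanta)
```

If the check returns TRUE, the full data file is probably the better source for that query.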
npi_search() on multiple search terms
npi_search() searches on a defined set of query parameters. The
function is not designed to search on multiple values of the same
argument at once, for example multiple NPI numbers in a single function
call. However, users can still execute searches serially for multiple
values of a single query parameter by using npi in combination with the
purrr package. In the example below, we search for multiple NPI
numbers. The purrr::map() function applies npi_search() to each element
of the vector, and dplyr::bind_rows() then combines the resulting list
of data frames into a single tibble containing the matching records.
npis <- c(1992708929, 1831192848, 1699778688, 1111111111) # Last element doesn't exist

out <- npis %>%
  purrr::map(., ~ npi_search(number = .)) %>%
  dplyr::bind_rows()
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_summarize(out)
#> # A tibble: 3 × 6
#> npi name enumeration…¹ prima…² phone prima…³
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1992708929 NOVAMED MANAGEMENT SERVICES LLC Organization 3200 D… 404-… Dentis…
#> 2 1831192848 MATTHEW JAFFE Individual 3672 M… 770-… Orthop…
#> 3 1699778688 STEVEN PARNES Individual <NA> 770-… Clinic…
#> # … with abbreviated variable names ¹enumeration_type,
#> # ²primary_practice_address, ³primary_taxonomy
Here we search for multiple zip codes in Los Angeles County.
codes <- c(90210, 90211, 90212)

zip_3 <- codes %>%
  purrr::map(., ~ npi_search(postal_code = .)) %>%
  dplyr::bind_rows()
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_flatten(zip_3)
#> # A tibble: 86 × 47
#> npi basic…¹ basic…² basic…³ basic…⁴ basic…⁵ basic…⁶ basic…⁷ basic…⁸ basic…⁹
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1063… RICHARD ROGAL A PhD NO M 2006-0… 2007-0… A
#> 2 1063… RICHARD ROGAL A PhD NO M 2006-0… 2007-0… A
#> 3 1063… RICHARD ROGAL A PhD NO M 2006-0… 2007-0… A
#> 4 1063… RICHARD ROGAL A PhD NO M 2006-0… 2007-0… A
#> 5 1063… MINOO MAHMOU… <NA> MD YES F 2019-0… 2019-0… A
#> 6 1063… MINOO MAHMOU… <NA> MD YES F 2019-0… 2019-0… A
#> 7 1093… FRED EMMANU… FERAYD… DDS NO M 2007-0… 2007-0… A
#> 8 1093… FRED EMMANU… FERAYD… DDS NO M 2007-0… 2007-0… A
#> 9 1104… GILBERT KWONG <NA> D.D.S. YES M 2016-0… 2016-0… A
#> 10 1104… GILBERT KWONG <NA> D.D.S. YES M 2016-0… 2016-0… A
#> # … with 76 more rows, 37 more variables: basic_name_prefix <chr>,
#> # basic_name_suffix <chr>, basic_organization_name <chr>,
#> # basic_organizational_subpart <chr>,
#> # basic_authorized_official_first_name <chr>,
#> # basic_authorized_official_last_name <chr>,
#> # basic_authorized_official_telephone_number <chr>,
#> # basic_authorized_official_title_or_position <chr>, …
Consult the R for Data Science chapter on iteration to
learn more about using the purrr
package.
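When many values are searched in sequence, a single failed call (for example, a network error) would otherwise stop the whole pipeline. The sketch below is one possible way to guard against that, not the package's own method: it wraps the search function with purrr::possibly() so that failed calls are skipped. The helper name search_each() and its search_fn argument are hypothetical.

```r
library(purrr)
library(dplyr)

# Hypothetical helper (not part of npi): apply a one-value search function to
# each value and row-bind the results, skipping values whose search errored.
search_each <- function(values, search_fn) {
  # possibly() returns NULL instead of raising when search_fn fails
  safe_search <- purrr::possibly(search_fn, otherwise = NULL)
  values %>%
    purrr::map(safe_search) %>%
    purrr::compact() %>%   # drop the NULLs left by failed calls
    dplyr::bind_rows()
}

# Usage with the registry (requires network access):
# out <- search_each(npis, function(x) npi::npi_search(number = x))
```

Passing the search function as an argument keeps the helper independent of which query parameter is being varied.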
Alternatively, if you are unfamiliar with the tidyverse approach, you can use a simple for loop.
npis <- c(1992708929, 1831192848, 1699778688, 1111111111) # Last element doesn't exist

combined_df <- data.frame()
for (i in npis) {
  combined_df <- rbind(combined_df, npi_search(number = i))
}
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
#> 10 records requested
#> Requesting records 0-10...
npi_summarize(combined_df)
#> # A tibble: 3 × 6
#> npi name enumeration…¹ prima…² phone prima…³
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1992708929 NOVAMED MANAGEMENT SERVICES LLC Organization 3200 D… 404-… Dentis…
#> 2 1831192848 MATTHEW JAFFE Individual 3672 M… 770-… Orthop…
#> 3 1699778688 STEVEN PARNES Individual <NA> 770-… Clinic…
#> # … with abbreviated variable names ¹enumeration_type,
#> # ²primary_practice_address, ³primary_taxonomy
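Growing a data frame with rbind() inside a loop copies the accumulated rows on every iteration, which can become slow for long vectors of search values. A minimal base R sketch of an alternative is to store each result in a preallocated list and bind once at the end. The helper name collect_results() and its search_fn argument are hypothetical, and the sketch assumes search_fn always returns a data frame.

```r
# Hypothetical helper (not part of npi): loop over values, keep each result in
# a preallocated list, and row-bind once after the loop finishes.
collect_results <- function(values, search_fn) {
  results <- vector("list", length(values))
  for (i in seq_along(values)) {
    # Assumes search_fn() returns a data frame (possibly with zero rows)
    results[[i]] <- search_fn(values[[i]])
  }
  do.call(rbind, results)
}

# Usage with the registry (requires network access):
# combined_df <- collect_results(npis, function(x) npi::npi_search(number = x))
```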