*Purpose: This program creates the dataset for the hospital staffing analysis. *Author: Martha Johnson *Date: 1/13/2020 ** Contents ** *1) Import and save data that will be merged into the hospital survey data *2) Import and prepare hospital survey data *3) Identify and replace missings and outliers *set directories global input "U:\NY Nurse Project\input" global output "U:\NY Nurse Project\output" ** 1) Import and save data that will be merged into the hospital survey data ** *import wage data import excel using "$input\wages", clear firstrow *save file save "$input\wages", replace *import number of beds at each hospital import excel using "$input\Hosp_BedOccupancy_2016-2017", /// case(lower) allstring clear firstrow *destring where possible destring *, replace *save file save "$input\hospital_beds", replace *import facility county code info import excel using "$input\hist_provider_data_Hosp&NH_2016_2018", /// case(lower) allstring clear firstrow *destring where possible destring *, replace *only need county name and county code from this file bysort facility_county_code: keep if _n==1 keep county_name facility_county_code *change saint lawrence to match replace county_name="St.Lawrence" if county_name=="Saint Lawrence" *save file save "$input\county_codes", replace ** 2) Import and prepare hospital survey data ** *import survey data import excel using "$input\FacilityDetailReport_3-13-2019_mcj_changed_varnames", /// cellrange(A2:FI217) case(lower) allstring clear firstrow *clean up string characters before destringing foreach var of varlist * { replace `var'="" if `var'=="n/a" | `var'=="N/A" replace `var'=regexr(`var',",","") } *destring destring *, replace *drop blank variables, just category headers in the excel spreadsheet missings dropvars, force /*note: surveypointofcontact hospitalunits neonatalinpatientunit pediatricinpatientunit adultinpatientunit psychiatricpatientunit rehabinpatientunits otherpatienttypeunits dropped*/ *rename variables rename leveliiiiv_critical_care_pat_da leveliiiiv_crit_care_pat_days rename step_down_and_telemetry_pat_days stepdn_telemtry_peds_pat_days rename step_down_telemetry_adult_pat_da step_dn_telemtry_adult_pat_days rename behavioral_health_chemical_pat_d behavl_healthchemicl_pat_days rename stepdn_telemetry_peds_lns stepdn_telemetry_peds_lpns rename behavl_healthchemical_rns behavl_healthchemicl_rns rename childadolescent_pat_days childadolest_pat_days rename leveli_continuing_care_pat_days leveli_contg_care_pat_days rename levelii_intiate_pat_days levelii_int_pat_days rename medsurg_combined_adult_pat_days medsurg_combd_adult_pat_days rename medsurg_combined_pat_days medsurg_peds_combd_pat_days rename other_psych_unit_pat_days other_unit_type_psy_pat_days rename other_unit_type_psych_rns other_unit_type_psy_rns rename surgical_trauma_pat_days pediatric_surgical_pat_days rename surgicaltrauma_adult_pat_days surgical_adult_pat_days rename medical_pat_days medical_peds_pat_days rename rehab_ad_mixed_acuity_pat_days rehab_mixd_acuity_ad_pat_days rename rehab_peds_mixd_acuity_rns rehab_ped_mixd_acuity_rns rename rehab_peds_mixed_acuity_pat_days rehab_ped_mixd_acuity_pat_days rename specialty_pat_days specialty_psych_pat_days rename step_dn_telemetry_adult_rns step_dn_telemtry_adult_rns rename stepdn_telemetry_peds_lpns stepdn_telemtry_peds_lpns rename stepdn_telemetry_peds_rns stepdn_telemtry_peds_rns rename childadolest_pat_days childadolest_psych_pat_days rename other_ambulatory_pat_days other_ambul_care_pat_days rename other_peri_operative_rn other_peri_operative_rns *merge in number of hospital beds clonevar fac_facid=reportingorganizationid merge 1:1 fac_facid using "$input\hospital_beds" drop if _m==2 drop _m *follow DOH's replacements for the 11 hospitals not in the hospital bed dataset replace beds=0 if fac_facid==306 replace beds=0 if fac_facid==309 replace beds=0 if fac_facid==9753 replace beds=103 if fac_facid==698 replace beds=25 if fac_facid==10216 replace beds=0 if fac_facid==1185 replace beds=40 if fac_facid==1465 replace beds=2 if fac_facid==1723 replace beds=0 if fac_facid==8554 replace beds=0 if fac_facid==9691 replace beds=2 if fac_facid==9700 *merge in county codes clonevar county_name=county merge m:1 county_name using "$input\county_codes" drop if _m==2 drop _m *assign regions gen phipregion="" replace phipregion="Capital Region" if inlist(facility_county_code, 1, 21, 39, 91, 93, 83) replace phipregion="Central NY" if inlist(facility_county_code, 11, 23, 53, 65, 67, 75) replace phipregion="Finger Lakes" if inlist(facility_county_code, 15, 51, 55, 69, 97, 99, 101, 117, 123) replace phipregion="Long Island" if inlist(facility_county_code, 59, 103) replace phipregion="Mid-Hudson" if inlist(facility_county_code, 27, 71, 79, 87, 105, 111, 119) replace phipregion="Mohawk Valley" if inlist(facility_county_code, 35, 43, 57, 77, 95) replace phipregion="New York City" if inlist(facility_county_code, 5, 47, 61, 85, 81) replace phipregion="North Country" if inlist(facility_county_code, 19, 31, 33, 113, 115) replace phipregion="Southern Tier" if inlist(facility_county_code, 7, 17, 25, 107, 109) replace phipregion="Tug Hill Seaway" if inlist(facility_county_code, 45, 49, 89) replace phipregion="Western NY" if inlist(facility_county_code, 3, 9, 13, 29, 63, 73, 37, 121) *DOH did not include any of the other categories apart from emergency department *Drop other, ambulatory care; other, interventional; and other, peri-operative drop other_ambul* drop other_interventional* drop other_peri* ** 3) Identify and replace missings and outliers ** Missings ** *Create variables for patient volume and nursing hours, for each service unit, treating missings as zero *Then, for each service unit, count as missing if one is zero but the other is positive foreach prefix in adolescent_psych adult_mixed_acuity adult_psych adult_rehab behavl_healthchemicl child_psych childadolest_psych critical_care_adult critical_care_peds geropsych leveli_contg_care levelii_int leveliiiiv_crit_care medical_adult medical_peds medsurg_combd_adult medsurg_peds_combd multiple_unit_types neonate_mixed_acuity obstetrics other_ed other_unit_type_psy pediatric_rehab pediatric_surgical peds_mixed_acuity rehab_mixd_acuity_ad rehab_ped_mixd_acuity skilled_nursing specialty_psych step_dn_telemtry_adult stepdn_telemtry_peds surgical_adult well_baby_nursery { gen `prefix'_mi=0 egen `prefix'_nh=rowtotal(`prefix'_rns `prefix'_lpns) egen `prefix'_pv=rowtotal(`prefix'_pat_days) replace `prefix'_mi=1 if (`prefix'_nh==0 & `prefix'_pv>0) | (`prefix'_nh>0 & `prefix'_pv==0) } *drop three hospitals that contribute no data drop if inlist(reportingorganizationid, 414, 698, 1465) *Impute missing values using state-wide medians, adjusting by number of beds *median number of beds sum beds, d local beds_median=r(p50) display `beds_median' *calculate medians of nursing hours and patient volume, then make adjustments if missing foreach prefix in adolescent_psych adult_mixed_acuity adult_psych adult_rehab behavl_healthchemicl child_psych childadolest_psych critical_care_adult critical_care_peds geropsych leveli_contg_care levelii_int leveliiiiv_crit_care medsurg_combd_adult medical_adult surgical_adult medical_peds medsurg_peds_combd multiple_unit_types neonate_mixed_acuity obstetrics other_ed other_unit_type_psy pediatric_rehab pediatric_surgical peds_mixed_acuity rehab_mixd_acuity_ad rehab_ped_mixd_acuity skilled_nursing specialty_psych step_dn_telemtry_adult stepdn_telemtry_peds well_baby_nursery { foreach type in nh { quietly sum `prefix'_`type' if `prefix'_nh>0 & `prefix'_mi==0, d local `prefix'_`type'm=r(p50) display ``prefix'_`type'm' count if `prefix'_`type'==0 & `prefix'_mi==1 replace `prefix'_`type'=``prefix'_`type'm'*(beds/`beds_median') if `prefix'_`type'==0 & `prefix'_mi==1 replace `prefix'_`type'=0 if `prefix'_`type'==. } foreach type in pv { quietly sum `prefix'_`type' if `prefix'_pv>0 & `prefix'_mi==0, d local `prefix'_`type'm=r(p50) display ``prefix'_`type'm' count if `prefix'_`type'==0 & `prefix'_mi==1 replace `prefix'_`type'=``prefix'_`type'm'*(beds/`beds_median') if `prefix'_`type'==0 & `prefix'_mi==1 replace `prefix'_`type'=0 if `prefix'_`type'==. } } ** Outliers ** *Define outliers for each service unit if the ratio of patient volume to nursing hours is more than 2 standard deviations from the median of that ratio. *Replace nursing hours and patient volume for those outlier hospitals-service-units using median (now calculated after excluding outliers) adjusted by number of beds. foreach prefix in adolescent_psych adult_mixed_acuity adult_psych adult_rehab behavl_healthchemicl child_psych childadolest_psych critical_care_adult critical_care_peds geropsych leveli_contg_care levelii_int leveliiiiv_crit_care medical_adult medical_peds medsurg_combd_adult medsurg_peds_combd multiple_unit_types neonate_mixed_acuity obstetrics other_ed other_unit_type_psy pediatric_rehab pediatric_surgical peds_mixed_acuity rehab_mixd_acuity_ad rehab_ped_mixd_acuity skilled_nursing specialty_psych step_dn_telemtry_adult stepdn_telemtry_peds surgical_adult well_baby_nursery { gen `prefix'_ratio=`prefix'_pv/`prefix'_nh egen `prefix'_sd=sd(`prefix'_ratio) egen `prefix'_p50=median(`prefix'_ratio) gen `prefix'_sds=(abs(`prefix'_ratio-`prefix'_p50))/`prefix'_sd gen `prefix'_ol2=`prefix'_sds>2 & !mi(`prefix'_sds) quietly sum `prefix'_nh if `prefix'_ol2==0, d local `prefix'_nhm=r(p50) replace `prefix'_nh=``prefix'_nhm'*(beds/`beds_median') if `prefix'_ol2==1 quietly sum `prefix'_pv if `prefix'_ol2==0, d local `prefix'_pvm=r(p50) replace `prefix'_pv=``prefix'_pvm'*(beds/`beds_median') if `prefix'_ol2==1 } *drop variables to re-generate capture drop *_ratio *_sd *_sds *_p50 *Define outliers for each service unit if the ratio of patient volume to nursing hours is more than 3 standard deviations from the median of that ratio. *Replace nursing hours and patient volume for those outlier hospitals-service-units using median (now calculated after excluding outliers) adjusted by number of beds. foreach prefix in adolescent_psych adult_mixed_acuity adult_psych adult_rehab behavl_healthchemicl child_psych childadolest_psych critical_care_adult critical_care_peds geropsych leveli_contg_care levelii_int leveliiiiv_crit_care medical_adult medical_peds medsurg_combd_adult medsurg_peds_combd multiple_unit_types neonate_mixed_acuity obstetrics other_ed other_unit_type_psy pediatric_rehab pediatric_surgical peds_mixed_acuity rehab_mixd_acuity_ad rehab_ped_mixd_acuity skilled_nursing specialty_psych step_dn_telemtry_adult stepdn_telemtry_peds surgical_adult well_baby_nursery { gen `prefix'_ratio=`prefix'_pv/`prefix'_nh egen `prefix'_sd=sd(`prefix'_ratio) egen `prefix'_p50=median(`prefix'_ratio) gen `prefix'_sds=(abs(`prefix'_ratio-`prefix'_p50))/`prefix'_sd gen `prefix'_ol3=`prefix'_sds>3 & !mi(`prefix'_sds) quietly sum `prefix'_nh if `prefix'_ol3==0, d local `prefix'_nhm=r(p50) replace `prefix'_nh=``prefix'_nhm'*(beds/`beds_median') if `prefix'_ol3==1 quietly sum `prefix'_pv if `prefix'_ol3==0, d local `prefix'_pvm=r(p50) replace `prefix'_pv=``prefix'_pvm'*(beds/`beds_median') if `prefix'_ol3==1 } *save dataset save "$input\hospital_data", replace