Australia has the highest incidence of skin cancer in the world. Two out of three Australians are diagnosed with skin cancer in their lifetime,^{1} most of which is managed surgically. Surgical site infection (SSI) following dermatological surgery can be associated with prolonged wound healing, lengthened recovery time, poorer cosmetic outcomes and increased healthcare costs,^{2} although in most cases the consequences are minor and can be resolved with a short course of therapeutic antibiotics. Current antibiotic guidelines do not recommend routine antibiotic prophylaxis following dermatological surgery; however, doctors might consider judicious prophylaxis under high-risk circumstances.^{3} Given emerging antibiotic resistance,^{4} identifying patients at higher risk of infection is necessary to stratify risk and encourage judicious antibiotic prescribing. Reducing bacterial load through preoperative washing with chlorhexidine and nasal mupirocin is an alternative to antibiotic use,^{5} and risk stratification could help to select patients for this more conservative intervention.
An understanding of patient and procedural risk factors is necessary to accurately define groups predisposed to developing an SSI. In most cases, with carefully planned surgery and diagnostic accuracy,^{6} these predictors are established prior to surgery. Published studies identify the age and sex of the patient and the histology, location and complexity of the excision as risk factors for SSI.^{7–13} However, several of these studies had limited power because of a low incidence of infection and small sample sizes, resulting in few outcomes. A higher proportion of dermatological surgery takes place in general practice than in hospital or specialist clinic settings in Australia,^{14} but this setting is not well represented in the literature. The incidence of infection following clean minor surgery in an Australian general practice population in the absence of antibiotic prophylaxis has consistently been demonstrated to be higher than the accepted rate of 5%,^{10,11,15} making this population ideal for studying a relatively rare outcome such as infection.
The aim of this study was to develop a clinical prediction rule to identify patients at risk of developing SSI who might benefit from prophylactic antimicrobial therapy, thus promoting the judicious use of antibiotics while reducing patient morbidity.
Methods
To identify the risk factors for SSI in a large cohort, we combined individual participant data from four randomised controlled trials (RCTs) conducted in a regional centre in North Queensland, Australia.^{16–19} Detailed methods for combining this prospectively collected data have been described previously.^{20} In brief, patients over the age of 18 years presenting to one of four general practices for removal of a skin lesion who were not currently on immunosuppressant medication or prophylactic antibiotics (or prescribed antibiotics immediately postsurgery) were eligible. Surveillance criteria for superficial SSI were strictly followed.^{21} A matrix of variables common to all studies was produced and included patient, lesion and excision characteristics.
Statistical methods
Statistical analysis was performed using STATA software (v16; StataCorp, College Station, TX, USA) and SAS OnDemand (2023; SAS Institute, Cary, USA). A P-value of less than 0.05 was considered to signify statistical significance.
Description of baseline variables
Data were initially examined at baseline (Table 1). For descriptive analysis of categorical data, absolute and relative frequencies were calculated. The incidence of SSI was presented with the 95% confidence interval (CI). Numerical data were first assessed for normality using histograms, summary statistics and the Shapiro–Wilk test of normality. Both numerical variables in the dataset (age and length of excision) were skewed and were described using the median, interquartile range and range. For inferential analysis comparing patients with SSI and patients with no SSI, chi-squared and Fisher’s exact tests were applied for categorical variables, and Wilcoxon rank-sum tests were used for numerical data.
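As an illustration of the incidence calculation described above, the following minimal sketch computes the overall SSI incidence from the counts in Table 1 (298 infections among 3819 patients) with a 95% CI. The source does not state which interval method was used; the Wilson score interval and the function name `wilson_ci` are assumptions made for this example only.

```python
import math


def wilson_ci(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from Table 1: 298 SSIs among 3819 patients.
incidence = 298 / 3819
lower, upper = wilson_ci(298, 3819)
print(f"SSI incidence: {incidence:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Other interval methods (for example, exact binomial) would give slightly different limits.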
Table 1. Patient and excision characteristics

| Characteristic | Overall (n=3819) | No SSI (n=3521) | SSI (n=298) | SSI rate (%) | P-value |
|---|---|---|---|---|---|
| Age (IQR); range (n=3794) | 63 (50, 73); range 5 to 101 | 62 (49, 73); range 5 to 101 | 68 (58, 75); range 15 to 91 | – | <0.001 |
| Age >55 years | 2548 | 2305 | 243 | 9.5 | <0.001 |
| Male (%) | 2095 (54.9) | 1912 (54.3) | 183 (61.4) | 8.7 | 0.018 |
| Medical conditions (%) | | | | | |
| Any condition | 520 (13.6) | 462 (13.1) | 58 (19.5) | 10.6 | 0.002 |
| Anaemia^{4} (n=478) | 1 (0) | 1 (0) | 0 (0) | 0.0 | 1.0 |
| Cancer^{1,3,4} (n=2572) | 50 (1.9) | 39 (1.6) | 11 (5.3) | 22.0 | <0.001 |
| COPD (n=3752) | 62 (1.6) | 53 (1.5) | 9 (3.0) | 14.5 | 0.047 |
| Diabetes (n=3818) | 285 (7.5) | 252 (7.2) | 33 (11.1) | 11.6 | 0.014 |
| Hypertension^{4} (n=478) | 119 (24.9) | 113 (26.0) | 6 (14.0) | 5.0 | 0.082 |
| Ischaemic heart disease^{3,4} (n=1663) | 45 (2.7) | 39 (2.6) | 6 (4.0) | 13.0 | 0.306 |
| Inflammatory skin disease^{1} (n=909) | 2 (0.2) | 2 (0.2) | 0 (0) | 0.0 | 1.0 |
| Peripheral vascular disease | 19 (0.5) | 18 (0.5) | 1 (0.3) | 5.3 | 1.0 |
| Medications (%) | | | | | |
| Any medication | 602 (15.8) | 538 (15.3) | 64 (21.5) | 10.6 | 0.005 |
| Anticoagulants^{1,3,4} (n=2572) | 204 (7.9) | 185 (7.8) | 19 (9.2) | 9.3 | 0.489 |
| Antiplatelet | 329 (8.6) | 292 (8.3) | 37 (12.4) | 11.3 | 0.015 |
| Daily inhaled steroids^{1,3,4} (n=2570) | 66 (2.6) | 60 (2.5) | 6 (2.9) | 9.1 | 0.754 |
| Immunosuppressants^{1} (n=909) | 12 (1.3) | 10 (1.2) | 2 (3.5) | 16.0 | 0.135 |
| Opioids^{1} (n=909) | 8 (0.9) | 8 (0.9) | 0 (0) | 0.0 | 1.0 |
| Oral steroids^{1,3,4} (n=2572) | 45 (1.7) | 39 (1.6) | 6 (2.9) | 13.3 | 0.189 |
| Disease-modifying antirheumatic drugs^{4} (n=478) | 2 (0.4) | 2 (0.5) | 0 (0) | 0.0 | 1.0 |
| Smoking status (%)^{1,3,4} (n=2549) | | | | | 0.250 |
| Non-smoker | 1579 (61.9) | 1464 (62.4) | 115 (56.7) | 7.3 | |
| Ex-smoker | 678 (26.6) | 615 (26.2) | 63 (31.0) | 9.3 | |
| Current smoker | 292 (11.5) | 267 (11.4) | 25 (12.3) | 8.6 | |
| Histology (%) (n=3818) | | | | | <0.001 |
| Benign | 1151 (30.1) | 1117 (31.7) | 34 (11.4) | 2.1 | |
| Premalignant | 814 (21.3) | 746 (21.2) | 68 (22.8) | 8.4 | |
| Malignant | 1853 (48.5) | 1657 (47.1) | 196 (65.8) | 10.6 | |
| Site of lesion (%) (n=3794) | | | | | <0.001 |
| Head and neck | 894 (23.6) | 864 (24.5) | 30 (10.1) | 3.4 | |
| Upper limbs | 1269 (33.4) | 1135 (32.2) | 134 (45) | 10.6 | |
| Torso | 667 (17.6) | 633 (18) | 34 (11.4) | 5.1 | |
| Upper leg | 437 (11.5) | 407 (11.6) | 30 (10.1) | 6.9 | |
| Below knee | 552 (14.5) | 482 (13.7) | 70 (23.5) | 12.7 | |
| Excision characteristics | | | | | |
| Excision length^{1,3,4} (IQR); range (n=2572) | 20 (15, 30); range 1.5 to 100 | 20 (14, 30); range 1.5 to 100 | 27 (20, 38); range 6 to 80 | – | <0.001 |
| Excision length >2 cm | 2761 | 2502 | 259 | 9.4 | <0.001 |
| Flap (%) (n=3815) | 54 (1.4) | 39 (1.1) | 15 (5.0) | 27.8 | <0.001 |

Description of patient and excision characteristics of 3819 patients undergoing minor skin excision and comparisons between patients with and without SSI. The data combine results from four clinical trials. Not all characteristics were assessed in all trials; the trial number and/or sample size is stated for variables with fewer than 3819 valid entries. Age (years) and excision length (mm) are presented as the median. The denominator for ‘any’ condition or medication in the pooled data combines ‘no’ and ‘missing values’ for trials that did not record certain conditions or medications. Superscript numbers adjacent to variables denote the trials in which the variable was recorded. No superscript number indicates that the variable was recorded in all four trials.
COPD, chronic obstructive pulmonary disease; IQR, interquartile range; SSI, surgical site infection. 
Selection of variables
After baseline analysis, the data were modified to make them suitable for developing a clinical prediction rule.
The rationale for each modification was based on the bivariate analysis of risk factors for infection in the baseline data, clinical knowledge of risk factors for infection, and pragmatism and parsimony, with the objective of developing a clinical prediction rule (Box 1).
Box 1. Data modifications
Continuous variables were converted to categorical data:
 Age was collapsed into two categories: ≤55 years or >55 years
 Excision length was collapsed into two categories: <2 cm or >2 cm

Categorical variables with sparse subgroups were collapsed into larger groups:
 Lesion site was collapsed into five regions: below the knee, upper leg, torso, upper limbs, head and neck
 Histology was collapsed into three groups: benign, premalignant and malignant

Variables with >50% complete data or considered a clinical priority were included, with dummy variables created for missing data:
 Dummy variables were created for smoking, diabetes, peripheral vascular disease and excision length
 All medications and medical conditions, with the exception of diabetes and peripheral vascular disease, were excluded

Identifying predictors of infection
The model had capacity for 29 predictors (events/10). The bivariate analysis (Table 1), completeness of data and parsimony were considered to select variables. The final categories used for input variables were age >55 years, sex, histology (three categories), lesion site (five categories), excision type, excision length, smoking, diabetes and peripheral vascular disease. Categorical characteristics were then entered into a logistic regression prediction model, with SSI as the dependent variable. Stepwise forward and backward selection processes were conducted to reach a model. Regression coefficients and odds ratios with 95% CIs were calculated.
Bootstrapping was used as an internal validation step to assess the final model. One thousand bootstrap samples were drawn from the original data sample as proxies for samples from the underlying population. The prediction model was fitted to each bootstrap sample and tested on the original sample. To adjust for overfitting, we intended to multiply the original regression coefficients by the shrinkage factor obtained by bootstrapping.
Performance and validation of the model
The maximum likelihood ratio, the pseudo R^{2} (1 − [L(0)/L(B)]^{2/n}) and the maximum rescaled R^{2} (R^{2}/[1 − {L(0)}^{2/n}]) were calculated, where L(0) is the likelihood of the intercept-only model, L(B) is the likelihood of the model with all covariates and n is the sample size. The Akaike information criterion (AIC; −2 log-likelihood + 2p) of the model including covariates was compared with that of the intercept-only model. The area under the curve (AUC) and the C-statistic were used to evaluate model discrimination. We calculated the Hosmer and Lemeshow goodness of fit. Bootstrapping was performed for calibration.
Developing clinical prediction rules
Initially, the probability of infection was calculated for each patient in the dataset. The sensitivity, specificity, positive predictive value, negative predictive value and number needed to treat (NNT) were calculated for different cutoffs for infection probabilities. We then assigned points for each risk factor. Points for each patient were summed to generate a score. We analysed the net benefit of the prediction model by weighing up the numbers and implications of true and false positive and negative diagnoses and comparing the results with alternative strategies of either treating all or no patients with antibiotics.
Ethics
This study incorporated data from four RCTs conducted by the lead author, all of which were approved by the author’s institutional human research ethics committee (approval numbers: H4572, H2590, H6065 and H1902).
Results
Data from 3819 patients were available for analysis. Patient characteristics and clinical details regarding the excisions are presented in Table 1. The median age of the 3819 patients was 63 years, and slightly more than half of the patients were men (54.9% vs 45.1% women). The final model included age (>55 years), premalignant and malignant histology (with benign as the reference), body sites (with face, scalp and neck as the reference) and complicated excisions (Table 2).
Table 2. Results of logistic regression modelling of risk factors for SSI in 3787 patients with minor skin surgery

| Characteristic | Before bootstrapping: regression coefficient | Before bootstrapping: OR (95% CI) | After bootstrapping: regression coefficient | After bootstrapping: OR (95% CI) |
|---|---|---|---|---|
| Intercept | −4.8 | 0.0081 | −4.8 | 0.0081 |
| Age >55 years | 0.54 | 1.72 (1.25–2.37) | 0.54 | 1.72 (1.25–2.36) |
| Histology of lesion | | | | |
| Benign | Referent | | Referent | |
| Premalignant | 1.04 | 2.84 (1.79–4.50) | 1.04 | 2.84 (1.75–4.58) |
| Malignant | 1.18 | 3.26 (2.15–4.90) | 1.18 | 3.26 (2.08–5.08) |
| Site of lesion | | | | |
| Face, neck, scalp | Referent | | Referent | |
| Below knee | 1.67 | 5.30 (3.37–8.33) | 1.67 | 5.30 (3.33–8.42) |
| Upper limbs | 1.11 | 3.04 (2.02–4.60) | 1.11 | 3.04 (2.02–4.59) |
| Torso | 0.97 | 2.64 (1.57–4.43) | 0.97 | 2.64 (1.52–4.58) |
| Upper leg | 0.94 | 2.56 (1.47–4.47) | 0.94 | 2.56 (1.45–4.52) |
| Type of excision | | | | |
| Simple excision | Referent | | Referent | |
| Flap | 1.46 | 4.33 (2.29–8.20) | 1.46 | 4.33 (2.34–8.01) |

CI, confidence interval; OR, odds ratio; SSI, surgical site infection. 
Performance and validation of the model
The AIC was 2094.3 for the intercept-only model and 1953.6 for the model with covariates, indicating that the model with predictors fitted the data better than the model with no predictors. The likelihood ratio chi-square was 156.7 (P<0.001). The pseudo R^{2} value was 0.0747, and the maximum rescaled R^{2} value was 0.10. The AUC was 0.704, corresponding with the C-statistic of 0.704, which could be considered good discrimination. The Hosmer and Lemeshow goodness of fit had a chi-square value of 4.25 (degrees of freedom 8, P=0.834). Bootstrapping showed little shrinkage of the final model (Table 2), and no adjustment was performed.
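The reported AIC values and likelihood ratio statistic can be cross-checked against each other using the AIC definition given in the Methods (−2 log-likelihood + 2p). The sketch below assumes that p counts the intercept, so that p = 1 for the intercept-only model and p = 9 for the final model (the intercept plus the eight coefficients in Table 2); this parameter count is an assumption of the example, not something the source states.

```python
# Reported values from the Results.
aic_intercept_only = 2094.3
aic_with_covariates = 1953.6

# Assumed parameter counts (intercept included in p).
p_intercept_only = 1
p_with_covariates = 9  # intercept + 8 coefficients listed in Table 2

# Recover -2 log-likelihood from AIC = -2 log-likelihood + 2p.
neg2ll_intercept_only = aic_intercept_only - 2 * p_intercept_only
neg2ll_with_covariates = aic_with_covariates - 2 * p_with_covariates

# The likelihood ratio chi-square is the difference in -2 log-likelihood.
likelihood_ratio = neg2ll_intercept_only - neg2ll_with_covariates
print(f"Likelihood ratio chi-square: {likelihood_ratio:.1f}")
```

Under these assumptions, the difference reproduces the reported likelihood ratio chi-square of 156.7.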
Assessment of utility of clinical prediction rules
The predicted probability for each patient ranged from 0.8% to 51%. Table 3 shows the net benefits of cutoffs across a range of different probabilities and the corresponding clinical prediction rule score. Table 4 shows our suggested clinical prediction rule based on the different probabilities.
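The predicted probability for an individual patient follows from the logistic model by summing the intercept and the applicable regression coefficients in Table 2 and applying the inverse logit. The following is a minimal sketch using the rounded coefficients from Table 2; the function name, dictionary keys and argument names are hypothetical, and results will differ slightly from values computed with unrounded coefficients.

```python
import math

# Rounded regression coefficients from Table 2 (reference levels set to 0).
INTERCEPT = -4.8
AGE_OVER_55 = 0.54
HISTOLOGY = {"benign": 0.0, "premalignant": 1.04, "malignant": 1.18}
SITE = {
    "head_neck": 0.0,
    "upper_leg": 0.94,
    "torso": 0.97,
    "upper_limb": 1.11,
    "below_knee": 1.67,
}
FLAP = 1.46


def predicted_probability(age_over_55, histology, site, flap):
    """Inverse logit of the linear predictor built from Table 2."""
    linear_predictor = (
        INTERCEPT
        + (AGE_OVER_55 if age_over_55 else 0.0)
        + HISTOLOGY[histology]
        + SITE[site]
        + (FLAP if flap else 0.0)
    )
    return 1 / (1 + math.exp(-linear_predictor))


# Lowest-risk and highest-risk covariate patterns.
lowest = predicted_probability(False, "benign", "head_neck", False)
highest = predicted_probability(True, "malignant", "below_knee", True)
print(f"Lowest: {lowest:.1%}, highest: {highest:.1%}")
```

With the rounded coefficients, the two extreme covariate patterns give probabilities of roughly 0.8% and 51%, in line with the reported range.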
Table 3. Sensitivity, specificity, PPV and NPV, and cutoff values of the predicted probability and corresponding prediction score

| Predicted probability | Prediction score | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | Number treated with antibiotics | Number of infections prevented | NNT |
|---|---|---|---|---|---|---|---|---|
| 0 | 0+ | 100 | 0 | 7.79 | – | 3815 | 298 | 12.80 |
| 0.025 | 2+ | 95.96 | 18.82 | 9.07 | 98.22 | 3141 | 285 | 11.00 |
| 0.05 | 3+ | 83.50 | 46.75 | 11.69 | 97.11 | 2121 | 248 | 8.55 |
| 0.1 | 4+ | 62.96 | 66.65 | 13.75 | 92.50 | 1360 | 187 | 7.27 |
| 0.11 | 5+ | 48.48 | 75.13 | 14.23 | 94.53 | 1019 | 144 | 7.10 |
| 0.125 | 7+ | 22.56 | 92.69 | 20.68 | 93.40 | 324 | 67 | 4.80 |
| 0.15 | 7+ | 22.56 | 92.70 | 20.74 | 93.41 | 323 | 67 | 4.80 |
| 0.2 | 9+ | 4.38 | 99.43 | 39.39 | 92.49 | 33 | 13 | 2.52 |
| 0.25 | 9+ | 4.38 | 99.43 | 39.39 | 92.49 | 33 | 13 | 2.52 |
| 0.3 | 9+ | 3.37 | 99.60 | 41.67 | 92.43 | 24 | 10 | 2.40 |
| 0.35 | 10+ | 3.03 | 99.66 | 42.86 | 92.41 | 21 | 9 | 2.30 |
| 0.45 | 11+ | 0.67 | 99.91 | 40 | 92.26 | 5 | 2 | 2.50 |
| 1.0 | 11+ | 0 | 100 | – | 92.21 | 0 | 0 | – |

NNT, number needed to treat; NPV, negative predictive value; PPV, positive predictive value.
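The PPV and NNT columns of Table 3 can be related directly to the two count columns. The sketch below assumes that every patient above the cutoff is treated and that each infection among treated patients counts as one infection prevented, so that PPV is the number prevented divided by the number treated and NNT is its reciprocal. The helper names are hypothetical, and only three rows of Table 3 are used.

```python
def ppv_percent(treated, prevented):
    """Share of treated patients who would otherwise have had an SSI."""
    return 100 * prevented / treated


def nnt(treated, prevented):
    """Patients treated per infection prevented."""
    return treated / prevented


# (number treated, infections prevented) for three cutoffs in Table 3.
rows = {0.05: (2121, 248), 0.15: (323, 67), 0.3: (24, 10)}

for cutoff, (treated, prevented) in rows.items():
    print(
        f"cutoff {cutoff}: PPV {ppv_percent(treated, prevented):.2f}%, "
        f"NNT {nnt(treated, prevented):.2f}"
    )
```

For these rows, the computed values agree closely with the tabulated PPV and NNT.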
Table 4. Clinical decision rules for general practitioners to predict the probability of infection and guide prophylactic treatment

Assign points

| Category | Criteria | Points |
|---|---|---|
| Patient age | >55 years | 1 |
| | ≤55 years | 0 |
| Anatomical excision site | Below knee | 4 |
| | Arm | 3 |
| | Upper leg | 2 |
| | Torso | 2 |
| | Head and neck | 0 |
| Expected histology of lesion | Premalignant | 2 |
| | Malignant | 2 |
| | Benign | 0 |
| Complexity of surgery required for excision | Flap | 4 |
| | Simple ellipse excision | 0 |
| Total score | | 0–11 |

Interpretation of the score

| Score | Predicted probability of infection | Considerations |
|---|---|---|
| 7+ | >15% | Bacterial decontamination with chlorhexidine wash, screening for MRSA with swab, and nasal mupirocin |
| 9+ | >25% | In addition to the above, consider antibiotic prophylaxis |

MRSA, methicillin-resistant Staphylococcus aureus.
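The scoring rule in Table 4 is simple enough to express as a lookup. The following is a minimal sketch of how the points could be summed and interpreted; the function names, dictionary keys and the wording of the returned messages are hypothetical and are not part of the published rule.

```python
# Points per category, as listed in Table 4.
SITE_POINTS = {
    "below_knee": 4,
    "arm": 3,
    "upper_leg": 2,
    "torso": 2,
    "head_neck": 0,
}
HISTOLOGY_POINTS = {"premalignant": 2, "malignant": 2, "benign": 0}


def ssi_score(age_years, site, expected_histology, flap):
    """Sum the Table 4 points for one planned excision (range 0 to 11)."""
    points = 1 if age_years > 55 else 0
    points += SITE_POINTS[site]
    points += HISTOLOGY_POINTS[expected_histology]
    points += 4 if flap else 0
    return points


def considerations(score):
    """Map a score to the considerations listed in Table 4."""
    if score >= 9:
        return "bacterial decontamination; consider antibiotic prophylaxis"
    if score >= 7:
        return "bacterial decontamination"
    return "no additional measure listed in Table 4"


# Example: 70-year-old patient, suspected malignant lesion below the knee,
# simple ellipse excision.
example_score = ssi_score(70, "below_knee", "malignant", flap=False)
print(example_score, "->", considerations(example_score))
```

A lookup of this kind is one way the rule could be embedded in an electronic medical record, although any such use would need local validation.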
Discussion
Our clinical prediction rule, developed using a large cohort of patients at general practice clinics in North Queensland, included details of patient age, lesion histology, lesion site and excision type (Table 2). These details are readily available and can be entered into the model prior to surgery to indicate patients at high risk of SSI and, consequently, whether prophylaxis (bacterial load reduction or antibiotics) is warranted.
The net benefit of a clinical prediction rule depends on the ‘cost’ of a false-positive versus a false-negative diagnosis. The cost depends on the negative consequences of the intervention (antibiotic resistance and side effects), and the calculation assumes that prophylaxis is 100% effective. For antibiotic prophylaxis, the extremes would be to treat everyone (which would prevent infection in 7.8% of the cohort) or to treat nobody (which would result in a 7.8% infection rate, which might be tolerable considering the concerns about antibiotic resistance and the relatively low morbidity cost of surgical site infection).
We suggest two possible scenarios. In the first, the cutoff is a probability of 0.15 (which corresponds to a clinical prediction score of 7+). This would result in treating 323 patients (<10% of the patient cohort) with antibiotics to prevent 67 infections, giving an NNT of 4.8. In the second, the cutoff is 0.25 (corresponding to a clinical prediction score of 9+), with 33 patients (<1%) treated to prevent 13 infections and an NNT of 2.52. Alternatively, doctors might choose a lower cutoff point for the intervention of reducing bacterial load, where false-positive diagnoses have fewer clinical implications. In choosing a cutoff point, doctors must weigh the probability of infection against the morbidity resulting from an infection in an individual patient and the morbidity from prophylaxis.
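The trade-off between false positives and false negatives can be made concrete with a net benefit calculation. The source does not state the formula it used; the sketch below assumes the standard decision-curve formulation, in which false positives are weighted by the odds of the threshold probability, and applies it to the counts in Table 3 for the two suggested cutoffs. The function name is hypothetical.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Decision-curve net benefit (assumed formulation, not from the source)."""
    weight = threshold / (1 - threshold)
    return true_positives / n - weight * false_positives / n


N = 3815  # patients in the 'treat all' row of Table 3

# Scenario 1: cutoff 0.15 (score 7+), 323 treated, 67 infections prevented.
nb_rule_015 = net_benefit(67, 323 - 67, N, 0.15)
# Scenario 2: cutoff 0.25 (score 9+), 33 treated, 13 infections prevented.
nb_rule_025 = net_benefit(13, 33 - 13, N, 0.25)

# Alternative strategies at the 0.15 threshold.
nb_treat_all = net_benefit(298, N - 298, N, 0.15)
nb_treat_none = 0.0

print(f"Rule at 0.15: {nb_rule_015:.4f}")
print(f"Rule at 0.25: {nb_rule_025:.4f}")
print(f"Treat all at 0.15: {nb_treat_all:.4f}")
```

Under this formulation, the rule has a small positive net benefit at both cutoffs, whereas treating all patients has a negative net benefit at a threshold of 0.15.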
Strengths and limitations
Although bootstrapping demonstrated good calibration, it is likely that the model is overfitted to a general practice patient population in Queensland, Australia, where infection rates are high due to environmental and patient factors. Despite this limitation, we hope that the model will help doctors in Australia with clinical decision making and reduce antibiotic prescribing.
Although we have no reason to believe that our model would not be predictive in environments with lower infection rates, the predicted probabilities of infection would need to be adjusted downwards, and our model is primarily designed for use in settings with similar infection rates. The model showed only modest discrimination. There were insufficient data regarding some potential risk factors for infection, such as immunosuppression and skin lesion ulceration, for these to be included in the model.
The strengths of our study are the large sample size and the large number of outcomes of SSI, which allowed us to develop our prediction model. The dataset also benefits from the prospective collection of data by the same investigator studying a similar patient group. Therefore, the clinical and methodological heterogeneity of the data can be assumed to be small. The current prediction model is based on a scoring system but could be adapted to be used in electronic medical records.
Clinical implementation of our prediction rule in general practice could quickly identify patients at high risk of developing SSI who could benefit from prophylactic bacterial load reduction or antibiotic treatment. By limiting the use of antimicrobial therapies to high-risk patients, unnecessary use of these treatments can be avoided, which could lead to a significant reduction in antibiotic use given the high number of skin cancer surgeries conducted in this setting in Australia. Future studies using data from diverse geographical sites are warranted to further test and refine the model and investigate its generalisability to the Australian population.