SQL 101 in R

by Nicholas Horton

The NASEM Data Science for Undergraduates report noted that the storage, preparation, and accessing of data are at the heart of data science and that students need to directly experience multiple forms of data, including the use of databases.

SQL (pronounced "sequel") stands for Structured Query Language; it is a language designed to manage data in a relational database system. The papers https://chance.amstat.org/2015/04/setting-the-stage and https://chance.amstat.org/2015/04/databases/ provide a high-level overview of database systems.

We will use a public-facing MySQL database containing wideband acoustic immittance (WAI) measures made on normal ears of adults. (The project is funded by the National Institutes of Health, NIDCD, and hosted on a server at Smith College, PI Susan Voss, R15 DC014129-01.) The database was created to enable auditory researchers to share WAI measurements and combine analyses over multiple datasets.

We begin by demonstrating how SQL queries can be sent to a database. The first step is to set up a connection using the dbConnect() function.

library(mosaic)
library(RMySQL) # note: there are plans to move this support to RMariaDB
con <- dbConnect(
  MySQL(), host = "scidb.smith.edu", user = "waiuser", 
  password = "smith_waiDB", dbname = "wai")

Next, a series of SQL queries can be sent to the database using the DBI::dbGetQuery() function; each query returns an R data frame.

class(dbGetQuery(con, "SHOW TABLES"))
## [1] "data.frame"
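The same connect, query, and disconnect pattern can be tried without network access. The following is a minimal sketch, assuming the RSQLite package is installed; the in-memory database, the connection name con_demo, and the toy Measurements table (with made-up study names and values) are illustrative stand-ins for the WAI server, not part of the WAI database.

```r
library(DBI)

# Assumption: RSQLite is installed; ":memory:" creates a throwaway database
con_demo <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy data standing in for the remote Measurements table
toy <- data.frame(
  Identifier = c("Study_A", "Study_A", "Study_B"),
  Freq       = c(210.938, 234.375, 210.938),
  Absorbance = c(0.05, 0.04, 0.07)
)
dbWriteTable(con_demo, "Measurements", toy)

# dbGetQuery() sends the SQL string and returns the result as a data frame
res <- dbGetQuery(con_demo, "SELECT * FROM Measurements WHERE Freq < 220")
class(res)
nrow(res)

# Close the connection when finished
dbDisconnect(con_demo)
```

Only the driver and connection arguments change between this sketch and the MySQL connection above; the dbGetQuery() calls have the same form.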

There are multiple tables within the wai database.

dbGetQuery(con, "SHOW TABLES")
##       Tables_in_wai
## 1          Codebook
## 2      Measurements
## 3 Measurements_2020
## 4           PI_Info
## 5       PI_Info_OLD
## 6           Subject
## 7     Subjects_2020

The EXPLAIN command describes the twelve fields (variables) in the PI_Info table.

dbGetQuery(con, "EXPLAIN PI_Info")
##                 Field        Type Null Key Default Extra
## 1          Identifier varchar(50)   NO PRI    <NA>      
## 2                Year     int(11)   NO        <NA>      
## 3             Authors        text   NO        <NA>      
## 4    AuthorsShortList        text   NO        <NA>      
## 5               Title        text   NO        <NA>      
## 6             Journal        text   NO        <NA>      
## 7                 URL        text   NO        <NA>      
## 8            Abstract        text   NO        <NA>      
## 9   DataSubmitterName        text   NO        <NA>      
## 10 DataSubmitterEmail        text   NO        <NA>      
## 11      DateSubmitted        text   NO        <NA>      
## 12           PI_Notes        text   NO        <NA>
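SHOW TABLES and EXPLAIN are MySQL statements, so they may not work on other database systems. DBI also provides the backend-independent helpers dbListTables() and dbListFields() for the same kind of inspection. The sketch below assumes RSQLite is installed and uses a hypothetical one-row PI_Info table in an in-memory database rather than the real WAI server.

```r
library(DBI)

# Assumption: RSQLite is installed; the table below is a made-up stand-in
con_demo <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(
  con_demo, "PI_Info",
  data.frame(Identifier = "Study_A", Year = 2014L, stringsAsFactors = FALSE)
)

# List the tables available through this connection
tables <- dbListTables(con_demo)
tables

# List the field (column) names of one table
fields <- dbListFields(con_demo, "PI_Info")
fields

dbDisconnect(con_demo)
```

Unlike EXPLAIN, dbListFields() returns only the field names, not their types or key information.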

The SELECT statement can be used to select all fields for eight observations in the Measurements table.

eightobs <- dbGetQuery(con, "SELECT * FROM Measurements LIMIT 8")
eightobs
##   Identifier Sub_Number Session Left_Ear MEP Instrument    Freq Absorbance
## 1  Abur_2014          1       1        0  -5          1 210.938  0.0451375
## 2  Abur_2014          1       1        0  -5          1 234.375  0.0441247
## 3  Abur_2014          1       1        0  -5          1 257.812  0.0495935
## 4  Abur_2014          1       1        0  -5          1 281.250  0.0516088
## 5  Abur_2014          1       1        0  -5          1 304.688  0.0590836
## 6  Abur_2014          1       1        0  -5          1 328.125  0.0628038
## 7  Abur_2014          1       1        0  -5          1 351.562  0.0682962
## 8  Abur_2014          1       1        0  -5          1 375.000  0.0738373
##        Zmag      Zang Canal_Area
## 1 110638000 -0.228113         NA
## 2 100482000 -0.230561         NA
## 3  90561100 -0.230213         NA
## 4  83515500 -0.230959         NA
## 5  77476800 -0.229652         NA
## 6  71229100 -0.230026         NA
## 7  66615500 -0.229576         NA
## 8  61996200 -0.229327         NA

More interesting and complicated SELECT calls can be used to undertake grouping and aggregation. Here we count the number of measurement records (rows) contributed by each study.

dbGetQuery(con, 
  "SELECT Identifier, count(*) AS NUM FROM Measurements GROUP BY Identifier ORDER BY NUM")
##       Identifier    NUM
## 1       Sun_2016   2604
## 2    Shaver_2013   2880
## 3    Feeney_2017   3162
## 4      Voss_1994   5120
## 5       Liu_2008   5520
## 6    Werner_2010   7935
## 7  Rosowski_2012  14384
## 8      Voss_2010  14880
## 9      Abur_2014  21328
## 10    Groon_2015  35469
## 11  Shahnaz_2006  58776
## 12    Lewis_2015 114716
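To see what GROUP BY combined with count(*) computes, it can help to run the same query on a table small enough to check by hand. This is a sketch under stated assumptions: RSQLite is installed, and the six-row Measurements table with study names Study_A, Study_B, and Study_C is invented for illustration.

```r
library(DBI)

# Assumption: RSQLite is installed; all data below are made up
con_demo <- dbConnect(RSQLite::SQLite(), ":memory:")
toy <- data.frame(
  Identifier = c("Study_A", "Study_A", "Study_A",
                 "Study_B", "Study_B", "Study_C"),
  Absorbance = c(0.05, 0.04, 0.06, 0.07, 0.08, 0.03)
)
dbWriteTable(con_demo, "Measurements", toy)

# One output row per distinct Identifier; NUM is the number of rows in that group
counts <- dbGetQuery(
  con_demo,
  "SELECT Identifier, count(*) AS NUM
   FROM Measurements
   GROUP BY Identifier
   ORDER BY NUM"
)
counts

dbDisconnect(con_demo)
```

Because ORDER BY NUM sorts in ascending order by default, Study_C (1 row) is listed first and Study_A (3 rows) last, mirroring the ordering of the study counts above.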

Accessing a database using dplyr commands

Alternatively, a connection can be made to the server by creating a series of dplyr tbl objects. Connecting with familiar dplyr syntax is attractive because, as Hadley Wickham has noted, SQL and R have similar syntax (but sufficiently different to be confusing).

The setup process looks similar.

Measurements <- tbl(con, "Measurements")
class(Measurements)
## [1] "tbl_MySQLConnection" "tbl_dbi"             "tbl_sql"            
## [4] "tbl_lazy"            "tbl"
PI_Info <- tbl(con, "PI_Info")
Subject <- tbl(con, "Subject")

We explore the PI_Info table using the collect() function, which forces computation on the database and returns the results. One attractive aspect of working with a database in this way is lazy evaluation, where computation is optimized and postponed as long as possible.
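Lazy evaluation can be made concrete with a small offline sketch. It assumes the dplyr, dbplyr, and RSQLite packages are installed, and it uses a made-up in-memory Measurements table rather than the WAI server. Building the pipeline sends nothing to the database; show_query() displays the SQL that would be generated, and collect() runs it.

```r
library(DBI)
library(dplyr)  # dbplyr must also be installed for database-backed tbl objects

# Assumption: RSQLite is installed; all data below are made up
con_demo <- dbConnect(RSQLite::SQLite(), ":memory:")
toy <- data.frame(
  Identifier = c("Study_A", "Study_A", "Study_A",
                 "Study_B", "Study_B", "Study_C"),
  Absorbance = c(0.05, 0.04, 0.06, 0.07, 0.08, 0.03)
)
dbWriteTable(con_demo, "Measurements", toy)
meas <- tbl(con_demo, "Measurements")

# Defining the pipeline only records the steps; no query is run yet
lazy_counts <- meas %>%
  group_by(Identifier) %>%
  summarise(NUM = n()) %>%
  arrange(NUM)

# Print the SQL that dplyr would send to the database
show_query(lazy_counts)

# collect() executes the query and returns a local tibble
local_counts <- collect(lazy_counts)
local_counts

dbDisconnect(con_demo)
```

The dplyr pipeline here expresses the same grouping and counting as the GROUP BY query shown earlier, which makes it a convenient way to compare the two syntaxes.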

PI_Info  %>% summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log [waiuser@scidb.smith.edu:/wai]
##   total
##   <dbl>
## 1    12
PI_Info %>% collect() %>% data.frame()   
##       Identifier Year
## 1      Abur_2014 2014
## 2    Feeney_2017 2017
## 3     Groon_2015 2015
## 4     Lewis_2015 2015
## 5       Liu_2008 2008
## 6  Rosowski_2012 2012
## 7   Shahnaz_2006 2006
## 8    Shaver_2013 2013
## 9       Sun_2016 2016
## 10     Voss_1994 1994
## 11     Voss_2010 2010
## 12   Werner_2010 2010
##                                                                                                                                          Authors
## 1                                                                                              Defne Abur, Nicholas J. Horton, and Susan E. Voss
## 2   M. Patrick Feeney,  Douglas H. Keefe, Lisa L. Hunter,  Denis F. Fitzpatrick, Angela C. Garinis, Daniel B. Putterman, and Garnett P. McMillan
## 3                                               Katherine A. Groon, Daniel M. Rasetshwane, Judy G. Kopun, Michael P. Gorga, and Stephen T. Neely
## 4                                                                                                            James D. Lewis and Stephen T. Neely
## 5                                  Yi-Wen Liu, Chris A. Sanford,  John C. Ellison, Denis F. Fitzpatrick,  Michael P. Gorga, and Douglas H. Keefe
## 6  John J. Rosowski, Hideko H. Nakajima, Mohamad A. Hamade, Lorice Mahfoud, Gabrielle R. Merchant, Christopher F. Halpin, and Saumil N. Merchant
## 7                                                                                                                   Navid Shahnaz and Karin Bork
## 8                                                                                                               Mark D. Shaver and Xiao-Ming Sun
## 9                                                                                                                                  Xiao-Ming Sun
## 10                                                                                                               Susan E. Voss and Jont B. Allen
## 11                               Susan E. Voss, Modupe F. Adegoke, Nicholas J. Horton, Kevin N. Sheth, Jonathan Rosand, and Christopher A. Shera
## 12                                                                                          Lynne A. Werner, Ellen C. Levi, and Douglas H. Keefe
##    AuthorsShortList
## 1       Abur et al.
## 2     Feeney et al.
## 3      Groon et al.
## 4   Lewis and Neely
## 5        Liu et al.
## 6   Rosowski et al.
## 7  Shahnaz and Bork
## 8    Shaver and Sun
## 9               Sun
## 10   Voss and Allen
## 11      Voss et al.
## 12    Werner et al.
##                                                                                                                                     Title
## 1                                                                                          Instrasubject variability in power reflectance
## 2       Normative wideband reflectance, equivalent admittance at the tympanic membrane, and acoustic stapedius reflex threshold in adults
## 3                                                                                       Air-leak effects on ear-canal acoustic absorbance
## 4                                                                    Non-invasive estimation of middle-ear input impedance and efficiency
## 5                    Wideband absorbance tympanometry using pressure sweeps: System development and results on adults with normal hearing
## 6                                                         Ear-canal reflectance, umbo velocity, and tympanometry in normal-hearing adults
## 7                                                                       Wideband reflectance norms for caucasian and chinese young adults
## 8  Wideband energy reflectance measurements: Effects of negative middle ear pressure and application of a pressure compensation procedure
## 9                       Wideband acoustic immittance: Normative study and test-retest reliability of tympanometric measurements in adults
## 10                                                               Measurement of acoustic impedance and reflectance in the human ear canal
## 11                                                               Posture systematically alters ear-canal reflectance and DPOAE properties
## 12                                            Ear-canal wideband acoustic transfer functions of adults and two- to nine-month-old infants
##                   Journal
## 1        J Am Acad Audiol
## 2                Ear Hear
## 3                Ear Hear
## 4         J Acoust Soc Am
## 5         J Acoust Soc Am
## 6                Ear Hear
## 7                Ear Hear
## 8         J Acoust Soc Am
## 9  J Speech Lang Hear Res
## 10        J Acoust Soc Am
## 11               Hear Res
## 12               Ear Hear
##                                                                                                                   URL
## 1                                                                        https://www.ncbi.nlm.nih.gov/pubmed/25257718
## 2                                                                        https://www.ncbi.nlm.nih.gov/pubmed/28045835
## 3  https://journals.lww.com/ear-hearing/fulltext/2015/01000/Air_Leak_Effects_on_Ear_Canal_Acoustic_Absorbance.16.aspx
## 4                                                                 https://asa.scitation.org/doi/abs/10.1121/1.4927408
## 5                                                                        https://www.ncbi.nlm.nih.gov/pubmed/19206798
## 6                                                                         http://www.ncbi.nlm.nih.gov/pubmed/21857517
## 7        http://journals.lww.com/ear-hearing/Abstract/2006/12000/Wideband_Reflectance_Norms_for_Caucasian_and.15.aspx
## 8                                                            "\nhttps://asa.scitation.org/doi/full/10.1121/1.4807509"
## 9                                                                        https://www.ncbi.nlm.nih.gov/pubmed/27517667
## 10                                                                 https://asa.scitation.org/doi/abs/10.1121/1.408329
## 11                                                                       https://www.ncbi.nlm.nih.gov/pubmed/20227475
## 12                                                                       https://www.ncbi.nlm.nih.gov/pubmed/20517155
##    Abstract
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         "<p> <strong> Background: </strong> Power reflectance measurements are an active area of research related to the development of noninvasive middle-ear assessment methods. There are limited data related to test-retest measures of power reflectance. </p>\n\n<p> <strong> Purpose: </strong> This study investigates test-retest features of power reflectance, including comparisons of intrasubject versus intersubject variability and how ear-canal measurement location affects measurements. </p>\n\n<p> <strong> Research design: </strong> Repeated measurements of power reflectance were made at about weekly intervals. The subjects returned for four to eight sessions. Measurements were made at three ear-canal locations: a deep insertion depth (with a foam plug flush at the entrance to the ear canal) and both 3 and 6 mm more lateral to this deep insertion. </p>\n\n<p> <strong> Study sample:</strong> Repeated measurements on seven subjects are reported. All subjects were female, between 19 and 22 yr old, and enrolled at an undergraduate women's college. </p>\n\n<p> <strong> Data collection and analysis: </strong> Measurements on both the right and left ears were made at three ear-canal locations during each of four to eight measurement sessions. 
Random-effects regression models were used for the analysis to account for repeated measures within subjects. The mean power reflectance for each position over all sessions was calculated for each subject. </p>\n\n<p> <strong> Results: </strong> The comparison of power reflectance from the left and right ears of an individual subject varied greatly over the seven subjects; the difference between the power reflectance measured on the left and that measured on the right was compared at 248 frequencies, and depending on the subject, the percentage of tested frequencies for which the left and right ears differed significantly ranged from 10% to 93% (some with left values greater than right values and others with the opposite pattern). Although the individual subjects showed left-right differences, the overall population generally did not show significant differences between the left and right ears. The mean power reflectance for each measurement position over all sessions depended on the location of the probe in the ear for frequencies of less than 1000 Hz. The standard deviation between subjects' mean power reflectance after controlling for ear (left or right) was found to be greater than the standard deviation within the individual subject's mean power reflectance. The intrasubject standard deviation in power reflectance was smallest at the deepest insertion depths. </p>\n\n<p> <strong> Conclusions: </strong>  All subjects had differences in power reflectance between their left and right ears at some frequencies; the percentage of frequencies at which differences occurred varied greatly across subjects. The intrasubject standard deviations were smallest for the deepest probe insertion depths, suggesting clinical measurements should be made with as deep an insertion as practically possible to minimize variability. This deep insertion will reduce both acoustic leaks and the effect of low-frequency ear-canal losses. 
The within-subject standard deviations were about half the magnitude of the overall standard deviations, quantifying the extent of intrasubject versus intersubject variability. </p>"
## 2  "<p> <strong> Objectives: </strong> Wideband acoustic immittance (WAI) measures such as\npressure reflectance, parameterized by absorbance and group delay,\nequivalent admittance at the tympanic membrane (TM), and acoustic\nstapedius reflex threshold (ASRT) describe middle ear function across a\nwide frequency range, compared with traditional tests employing a single\nfrequency. The objective of this study was to obtain normative data\nusing these tests for a group of normal-hearing adults and investigate\ntest-retest reliability using a longitudinal design. </p>\n\n<p> <strong> Design: </strong> A longitudinal prospective design was used to obtain normative\ntest and retest data on clinical and WAI measures. Subjects were 13\nmales and 20 females (mean age = 26 years). Inclusion criteria included\nnormal audiometry and clinical immittance. Subjects were tested on two\nseparate visits approximately 1 month apart. Reflectance and equivalent\nadmittance at the TM were measured from 0.25 to 8.0 kHz under\nthree conditions: at ambient pressure in the ear canal and with pressure\nsweeps from positive to negative pressure (downswept) and negative to\npositive pressure (upswept). Equivalent admittance at the TM was calculated\nusing admittance measurements at the probe tip that were adjusted\nusing a model of sound transmission in the ear canal and acoustic estimates\nof ear-canal area and length. Wideband ASRTs were measured\nat tympanometric peak pressure (TPP) derived from the average TPP\nof downswept and upswept tympanograms. Descriptive statistics were\nobtained for all WAI responses, and wideband and clinical ASRTs were\ncompared. </p>\n\n<p> <strong> Results: </strong> Mean absorbance at ambient pressure and TPP demonstrated\na broad band-pass pattern typical of previous studies. Test-retest differences\nwere lower for absorbance at TPP for the downswept method\ncompared with ambient pressure at frequencies between 1.0 and\n1.26 kHz. 
Mean tympanometric peak-to-tail differences for absorbance\nwere greatest around 1.0 to 2.0 kHz and similar for positive and negative\ntails. Mean group delay at ambient pressure and at TPP were greatest\nbetween 0.32 and 0.6 kHz at 200 to 300 &mu;sec, reduced at frequencies\nbetween 0.8 and 1.5 kHz, and increased above 1.5 kHz to around 150\n&mu;sec. Mean equivalent admittance at the TM had a lower level for the\nambient method than at TPP for both sweep directions below 1.2 kHz,\nbut the difference between methods was only statistically significant for\nthe comparison between the ambient method and TPP for the upswept\ntympanogram. Mean equivalent admittance phase was positive at all frequencies.\nTest-retest reliability of the equivalent admittance level ranged\nfrom 1 to 3 dB at frequencies below 1.0 kHz, but increased to 8 to 9\ndB at higher frequencies. The mean wideband ASRT for an ipsilateral\nbroadband noise activator was 12 dB lower than the clinical ASRT, but\nhad poorer reliability. </p>\n\n<p> <strong> Conclusions: </strong>  Normative data for the WAI test battery revealed minor\ndifferences for results at ambient pressure compared with tympanometric methods at TPP for reflectance, group delay, and equivalent\nadmittance level at the TM for subjects with middle ear pressure within\n&plusmn;100 daPa. Test-retest reliability was better for absorbance at TPP for\nthe downswept tympanogram compared with ambient pressure at frequencies\naround 1.0 kHz. Large peak-to-tail differences in absorbance\ncombined with good reliability at frequencies between about 0.7 and\n3.0 kHz suggest that this may be a sensitive frequency range for interpreting\nabsorbance at TPP. The mean wideband ipsilateral ASRT was\nlower than the clinical ASRT, consistent with previous studies. Results\nare promising for the use of a wideband test battery to evaluate middle\near function. 
</p>\n<p> <strong> Key words: </strong> Absorbed sound power, Acoustic stapedius reflex threshold,\nReflectance, Test-retest reliability, Wideband acoustic immittance. </p>"
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   "<p> <strong> Objective: </strong> Accurate ear-canal acoustic measurements, such as wideband\nacoustic admittance, absorbance, and otoacoustic emissions,\nrequire that the measurement probe be tightly sealed in the ear canal.\nAir leaks can compromise the validity of the measurements, interfere\nwith calibrations, and increase variability. There are no established procedures\nfor determining the presence of air leaks or criteria for what size\nleak would affect the accuracy of ear-canal acoustic measurements. The\npurpose of this study was to determine ways to quantify the effects of air\nleaks and to develop objective criteria to detect their presence. </p>\n\n<p> <strong> Design: </strong> Air leaks were simulated by modifying the foam tips that are\nused with the measurement probe through insertion of thin plastic\ntubing. 
To analyze the effect of air leaks, acoustic measurements were\ntaken with both modified and unmodified foam tips in brass-tube cavities\nand human ear canals. Measurements were initially made in cavities\nto determine the range of critical leaks. Subsequently, data were collected\nin ears of 21 adults with normal hearing and normal middle-ear\nfunction. Four acoustic metrics were used for predicting the presence of\nair leaks and for quantifying these leaks: (1) low-frequency admittance\nphase (averaged over 0.1-0.2 kHz), (2) low-frequency absorbance, (3)\nthe ratio of compliance volume to physical volume (CV/PV), and (4) the\nair-leak resonance frequency. The outcome variable in this analysis was\nthe absorbance change (&Delta;absorbance), which was calculated in eight\nfrequency bands. </p>\n\n<p> <strong> Results: </strong> The trends were similar for both the brass cavities and the ear\ncanals. &Delta;Absorbance generally increased with air-leak size and was largest\nfor the lower frequency bands (0.1-0.2 and 0.2-0.5 kHz). Air-leak\neffects were observed in frequencies up to 10 kHz, but their effects above\n1 kHz were unpredictable. These high-frequency air leaks were larger in\nbrass cavities than in ear canals. Each of the four predictor variables\nexhibited consistent dependence on air-leak size. Low-frequency admittance\nphase and CV/PV decreased, while low-frequency absorbance and\nthe air-leak resonance frequency increased. </p>\n\n<p> <strong> Conclusion: </strong>  The effect of air leaks can be significant when their equivalent\ndiameter exceeds 0.01 in. The observed effects were greatest at low frequencies\nwhere air leaks caused absorbance to increase. Recommended\ncriteria for detecting air leaks include the following: when the frequency\nrange of interest extends as low as 0.1 kHz, low-frequency absorbance\nshould be &#8804;0.20 and low-frequency admittance phase &#8805;61 degrees. 
For\nfrequency ranges as low as 0.2 kHz, low-frequency absorbance should\nbe &#8804;0.29 and low-frequency admittance phase &#8805;44 degrees. </p>"
## 4  "<p> A method to transform the impedance measured in the ear canal, <i>Z<sub>EC</sub></i>, to the plane of the eardrum,\n<i>Z<sub>ED</sub></i>, is described. The portion of the canal between the probe and eardrum was modeled as a concatenated\nseries of conical segments, allowing for spatial variations in its cross-sectional area. A\nmodel of the middle ear (ME) and cochlea terminated the ear-canal model, which permitted estimation\nof ME efficiency. Acoustic measurements of <i>Z<sub>EC</sub></i> were made at two probe locations in 15\nnormal-hearing subjects. <i>Z<sub>EC</sub></i> was sensitive to measurement location, especially near frequencies of\ncanal resonances and anti-resonances. Transforming <i>Z<sub>EC</sub></i> to <i>Z<sub>ED</sub></i> reduced the influence of the canal,\ndecreasing insertion-depth sensitivity of <i>Z<sub>ED</sub></i> between 1 and 12 kHz compared to <i>Z<sub>EC</sub></i>. Absorbance,\nA, was less sensitive to probe placement than <i>Z<sub>EC</sub></i>, but more sensitive than <i>Z<sub>ED</sub></i> above 5 kHz. <i>Z<sub>ED</sub></i>\nand A were similarly insensitive to probe placement between 1 and 5 kHz. The probe-placement\nsensitivity of <i>Z<sub>ED</sub></i> below 1 kHz was not reduced from that of either A or <i>Z<sub>EC</sub></i>. ME efficiency had a\nbandpass shape with greatest efficiency between 1 and 4 kHz. 
Estimates of <i>Z<sub>ED</sub></i> and ME efficiency\ncould extend the diagnostic capability of wideband-acoustic immittance measurements.</p>"
## 5  "<p> A system with potential for middle-ear screening and diagnostic testing was developed for the\nmeasurement of wideband energy absorbance (EA) in the ear canal as a function of air pressure, and\ntested on adults with normal hearing. Using a click stimulus, the EA was measured at 60 frequencies\nbetween 0.226 and 8 kHz. Ambient-pressure results were similar to past studies. To perform\ntympanometry, air pressure in the ear canal was controlled automatically to sweep between &#x2212;300\nand 200 daPa (ascending/descending directions) using sweep speeds of approximately 75, 100, 200,\nand 400 daPa/ s. Thus, the measurement time for wideband tympanometry ranged from 1.5 to 7 s\nand was suitable for clinical applications. A bandpass tympanogram, calculated for each ear by\nfrequency averaging EA from 0.38 to 2 kHz, had a single-peak shape; however, its tympanometric\npeak pressure (TPP) shifted as a function of sweep speed and direction. EA estimated at the TPP was\nsimilar across different sweep speeds, but was higher below 2 kHz than EA measured at ambient\npressure. Future studies of EA on normal ears of a different age group or on impaired ears may be\ncompared with the adult normal baseline obtained in this study.</p>"
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              "<p> <strong> Objective: </strong> This study compares measurements of ear-canal reflectance\n(ECR) to other objective measurements of middle ear function including\naudiometry, umbo velocity (<i>V<sub>U</sub></i>), and tympanometry in a population of\nstrictly defined normal-hearing ears. </p>\n\n<p> <strong> Design: </strong> Data were prospectively gathered from 58 ears of 29 normalhearing\nsubjects, 16 females and 13 males, aged 22 to 64 yr. 
Subjects\nmet all of the following criteria to be considered as having normal\nhearing: (1) no history of significant middle ear disease; (2) no history\nof otologic surgery; (3) normal tympanic membrane on otoscopy; (4)\npure-tone audiometric thresholds of 20 dB HL or better for 0.25 to 8\nkHz; (5) air-bone gaps no greater than 15 dB at 0.25 kHz and 10 dB for\n0.5 to 4 kHz; (6) normal, type-A peaked tympanograms; and (7) all\nsubjects had two ""normal"" ears (as defined by these criteria). Measurements\nincluded pure-tone audiometry for 0.25 to 8 kHz, standard\n226 Hz tympanometry, ECR for 0.2 to 6 kHz at 60 dB SPL using the\nMimosa Acoustics HearID system, and umbo velocity (<i>V<sub>U</sub></i>) for 0.3 to\n6 kHz at 70 to 90 dB SPL using the HLV-1000 laser Doppler\nvibrometer (Polytec Inc). </p>\n\n<p> <strong> Results: </strong> Mean power reflectance (|ECR|<sup>2</sup>) was near 1.0 at 0.2 to 0.3 kHz,\ndecreased to a broad minimum of 0.3 to 0.4 between 1 and 4 kHz, and\nthen sharply increased to almost 0.8 by 6 kHz. The mean pressure\nreflectance phase angle (&#8736;ECR) plotted on a linear frequency scale\nshowed a group delay of approximately 0.1 msec for 0.2 to 6 kHz. Small\nsignificant differences were observed in |ECR|<sup>2</sup> at the lowest frequencies\nbetween right and left ears and between males and females at 4 kHz.\n|ECR|<sup>2</sup> decreased with age but reached significance only at 1 kHz. Our\nECR measurements were generally similar to previous published reports.\nHighly significant negative correlations were found between\n|ECR|<sup>2</sup> and <i>V<sub>U</sub></i> for frequencies below 1 kHz. Significant correlations were\nalso found between the tympanometrically determined peak total\ncompliance and |ECR|<sup>2</sup> and <i>V<sub>U</sub></i> at frequencies below 1 kHz. 
The results\nsuggest that middle ear compliance contributes significantly to the\nmeasured power reflectance and umbo velocity at frequencies below 1\nkHz but not at higher frequencies. </p>\n\n<p> <strong> Conclusions: </strong>  This study has established a database of objective\nmeasurements of middle ear function (ECR, umbo velocity, tympanometry)\nin a population of strictly defined normal-hearing ears. These data\nwill promote our understanding of normal middle ear function and will\nserve as a control for comparison to similar measurements made in\npathological ears. </p>"
## 7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      "<document> <p> <strong> <i> Objective: </i> </strong> This study examined differences between\nthe middle ears of two ethnic groups, Caucasian\nand Chinese young adults with normal hearing,\nusing a new middle-ear measurement technique,\nwideband energy reflectance. The goal of this study\nwas to determine whether the Chinese group had\ndifferent middle-ear transmission properties than\nthe Caucasian group. 
</p>\n\n<p> <strong> <i> Design: </i> </strong> There were 126 subjects (237 ears) between\nthe ages of 18 and 32 yr, with 62 subjects in the\nCaucasian group and 64 subjects in the Chinese\ngroup. Wideband energy reflectance data were gathered\nusing Mimosa Acoustics (RMS system version\n4.0.4.4) wideband reflectance (WBR) equipment. </p>\n\n<p> <strong> <i> Results: </i> </strong> The Chinese group had significantly lower\nwideband energy reflectance than their Caucasian\ncounterparts at higher frequencies; however, the\nCaucasian group had significantly lower energy reflectance\nat lower frequencies than the Chinese\ngroup. The Chinese group also had significantly\nlower admittance magnitude than the Caucasian\ngroup at lower frequencies. Because body size indices\nare more comparable between Caucasian females and\nChinese males, the effect of body size could be potentially\nadjusted for by comparing the Caucasian female\nsubjects with the Chinese male subjects. The\ndifferences observed between the Caucasian and the\nChinese groups were no longer significant when the\nCaucasian female subjects were compared with the\nChinese male subjects. Applying the Caucasian norms\nto four Caucasian adults with surgically confirmed\notosclerosis resulted in an improved hit rate compared\nwith the combined Caucasian and Chinese\nnorms and the Chinese-only norms. </p>\n\n<p> <strong> <i> Conclusions: </i> </strong>  Body size may play a role in the observed\ndifferences between the Caucasian and Chinese\ngroups. The findings of this study suggest that\nfurther research is needed to investigate the effects\nof body size on wideband energy reflectance. It\nshould be noted that factors other than body size\nmay contribute to the observed differences. Chinese\nindividuals may simply have different middle-ear\ncharacteristics than Caucasian individuals, which\ncould affect WBR. 
In the meantime, overall test\nperformance may be improved by using a more homogeneous norm when evaluating the middle ear\nof Caucasian or Chinese individuals with WBR. </p>\n<p><small>(Ear & Hearing 2006;27;774-788)</small></p> </document>"
## 8  "<p> The wideband energy reflectance (ER) technique has become popular as a tool for evaluating middle\near function. Negative middle ear pressure (MEP) is a prevalent form of middle ear dysfunction,\nwhich may impact application of ER measurements in differential diagnosis. A negative MEP may\nbe countervailed by application of an equivalent negative ear canal pressure. The present study\nexamined ER in the same ears under normal and experimentally induced negative MEP conditions.\nThirty-five subjects produced at least one negative MEP each (&#x2212;40 to &#x2212;225 daPa). Negative MEP\nsignificantly altered ER in a frequency-specific manner that varied with MEP magnitude. ER\nincreased for low- to mid-frequencies with the largest change (~0.20 to 0.40) occurring between 1\nand 1.5 kHz. ER decreased for frequencies above 3 kHz with the largest change (~&#x2212;0.10 to &#x2212;0.25)\nobserved between 4.5 and 5.5 kHz. Magnitude of changes increased as MEP became more negative,\nas did the frequencies at which maximum changes occurred, and the frequency at which enhancement\ntransitioned to reduction. Ear canal pressure compensation restored ER to near baseline values.\nThis suggests that the compensation procedure adequately mitigates the effects of negative\nMEP on ER. Theoretical issues and clinical implications are discussed.</p>"
## 9  "Purpose: The purpose of this study was to present normative data of tympanometric measurements of wideband acoustic immittance and to characterize wideband tympanograms.\n\nMethod: Data were collected in 84 young adults with strictly defined normal hearing and middle ear status. Energy absorbance (EA) was measured using clicks for 1/12-octave frequencies (0.236 to 8 kHz), with the ear canal air pressure systematically varied (+200 to -300 daPa). In 40 ears, 7 consecutive trials and a trial of clinical 226-Hz acoustic admittance (Ya) tympanometry followed. A cavity test was also conducted.\n\nResults: From the wideband EA tympanogram, several EA spectrums and EA tympanograms were derived. Descriptive statistics were performed, and population parameters were estimated. The immediate test-retest reliability was excellent. Effects of ear canal air pressure on EA were examined comprehensively. Differences in EA between tympanometric and ambient-pressure measurements were significant. Single-frequency EA tympanograms exemplified for half-octave frequencies were contrasted. The bandpass EA tympanogram, 0.236- and 1-kHz EA and Ya tympanograms, and 226-Hz Ya tympanogram were compared in 9 variables.\n\nConclusions: This study established a database of wideband tympanograms in healthy adults. The data analyses will promote our understanding of the middle ear transfer function. These data will serve as a reference for further studies in clinical populations."
## 10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               "<p> The pressure reflectance <i>R (&omega;)</i> is the transfer function which may be defined for a linear one-port\nnetwork by the ratio of the reflected complex pressure divided by the incident complex pressure.\nThe reflectance is a function that is closely related to the impedance of the 1-port. The energy\nreflectance <i>&#8475;(&omega;)</i> is defined as <i> |R|<sup>2</sup></i>. 
It represents the ratio of reflected to incident energy. In the\nhuman ear canal the energy reflectance is important because it is a measure of the inefficiency of\nthe middle ear and cochlea, and because of the insight provided by its simple frequency domain\ninterpretation. One may characterize the ear canal impedance by use of the pressure reflectance\nand its magnitude, sidestepping the difficult problems of (a) the unknown canal length from the\nmeasurement point to the eardrum, (b) the complicated geometry of the drum, and (c) the\ncross-sectional area changes in the canal as a function of distance. Reported here are acoustic\nimpedancem easurements looking into the ear canal, measured on ten young adults with normal \nhearing (ages 18-24). The measurement point in the canal was approximately 0.85 cm from the\nentrance of the canal. From these measurements, the pressure reflectance in the canal is\ncomputed and impedance and reflectance measurements from 0.1 to 15.0 kHz are compared\namong ears. The average reflectance and the standard deviation of the reflectance for the ten\nsubjects have been determined. The impedance and reflectance of two common ear simulators,\nthe Br&uuml;el & Kjaer 4157 and the Industrial Research Products DB-100 (Zwislocki) coupler are\nalso measured and compared to the average human measurements. All measurements are made\nusing controls that assure a uniform accuracy in the acoustic calibration across subjects. This is\ndone by the use of two standard acoustic resistors whose impedances are known. From the\nexperimental results, it is concluded that there is significant subject variability in the magnitude\nof the reflectance for the ten ear canals. This variability is believed to be due to cochlear and\nmiddle ear impedance differences. An attempt was made at modeling the reflectance but, as\ndiscussed in the paper, several problems presently stand in the way of these models. 
Such models\nwould be useful for acoustic virtual-reality systems and for active noise control earphones.</p>"
## 11 "<p> Several studies have demonstrated that the auditory system is sensitive to changes in posture, presumably\nthrough changes in intracranial pressure (ICP) that in turn alter the intracochlear pressure, which\naffects the stiffness of the middle-ear system. This observation has led to efforts to develop an ear-canal\nbased noninvasive diagnostic measure for monitoring ICP, which is currently monitored invasively via\naccess through the skull or spine. Here, we demonstrate the effects of postural changes, and presumably\nICP changes, on distortion product otoacoustic emissions (DPOAE) magnitude, DPOAE angle, and power\nreflectance. Measurements were made on 12 normal-hearing subjects in two postural positions: upright\nat 90&#176; and tilted at -45&#176; to the horizontal. Measurements on each subject were repeated five times across\nfive separate measurement sessions. All three measures showed significant changes (<i>p</i> < 0.001) between\nupright and tilted for frequencies between 500 and 2000 Hz, and DPOAE angle changes were significant\nat all measured frequencies (500-4000 Hz). Intra-subject variability, assessed via standard deviations for\neach subject's multiple measurements, were generally smaller in the upright position relative to the\ntilted position.</p>"
## 12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      "<p> <strong> Objectives: </strong> Wideband acoustic transfer functions (WATF) measured in\nthe ear canal have been shown to be effective in the diagnosis of 
middle\near dysfunction in adults and in newborn infants. Although these\nmeasures would be diagnostically useful in older infants, normative data\non a large number of older infants are lacking. The goal of this study\nwas to provide such normative data. </p>\n\n<p> <strong> Design: </strong> The WATF of 458 infants aged 2 to 9 mos and of 210 adults\nwere obtained. Wideband reactance (X), resistance (R), and energy\nreflectance (ER) were measured in third-octave bands between 250 and\n8000 Hz. The effects of age and gender on the WATF were examined,\nand the WATF in the left and right ears were compared. Test-retest\nreliability was assessed, and the relationship between the 226-Hz\ntympanogram and the WATF was examined. </p>\n\n<p> <strong> Results: </strong> The results agreed well with previous reports testing fewer\nsubjects, which documented age-related change in these measures\nduring infancy and between infancy and adulthood. Test-retest correlations\nwithin third octaves were 0.5 to 0.7 at best, but did not vary\nsystematically with age. Infants' test-retest absolute differences within\nthird octaves for R and ER were similar to those of adults. The shape of\nthe WATF on retest was highly repeatable, and the shapes of the WATF\nin the ears of the same individual were qualitatively similar. The\nwideband impedance results were not different in the left and right ears,\nbut ER was slightly, but significantly, lower in the left ear than that in the\nright ear. Resistance and reactance magnitude were greater for females\nthan males, but there was no effect of gender on ER. Infants whose\n226-Hz tympanogram indicated reduced peak admittance (Types As and\nB) had greater resistance and reactance magnitude than those with\nnormal peak admittance (Types A and C), but no tympanometry group\ndifferences were evident in ER. </p>\n\n<p> <strong> Conclusions: </strong>  Age-graded norms are essential to the successful clinical\napplication of WATF. 
However, the effects of gender and laterality on the\nWATF are small.</p>"
##                      DataSubmitterName
## 1                           Susan Voss
## 2  M. Patrick Feeney; Douglas H. Keefe
## 3                          Steve Neely
## 4                          James Lewis
## 5                        Douglas Keefe
## 6                        John Rosowski
## 7                        Navid Shahnaz
## 8                        Xiao-Ming Sun
## 9                        Xiao-Ming Sun
## 10                          Susan Voss
## 11                          Susan Voss
## 12                       Douglas Keefe
##                                   DataSubmitterEmail DateSubmitted
## 1                                    svoss@smith.edu       8/24/16
## 2  Patrick.Feeney@va.gov; Douglas.Keefe@boystown.org        6/7/18
## 3                         Stephen.Neely@boystown.org       6/18/19
## 4                                  jdlewis@uthsc.edu      10/10/18
## 5                         Douglas.Keefe@boystown.org       6/26/18
## 6                     John_Rosowski@meei.harvard.edu       11/6/15
## 7                        nshahnaz@audiospeech.ubc.ca       8/24/16
## 8                          xiao-ming.sun@wichita.edu       10/6/18
## 9                          xiao-ming.sun@wichita.edu      10/31/17
## 10                                   svoss@smith.edu        2/8/17
## 11                                   svoss@smith.edu        6/5/18
## 12                        Douglas.Keefe@boystown.org        9/1/17
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         PI_Notes
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Measurements made on 7 subjects across multiple sessions and 3 probe locations for each subject.  Database includes measurements at only the deepest insertion depth (Position 1) and Channel B only.
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   Database includes measurements on 32 subjects, most with left and right ears and most with two sessions.  Each session includes WAI measurements at ambient ear canal, and the ear canal pressurized to TPP, negative tail pressure (-300 daPa) and positive tail pressure (200 daPa) with both "Downswept" and "Upswept" conditions.  Ear-canal areas used to calculate absorbance were estimated acoustically and are listed for the ambient and TPP measurements.  Impedance measurements at the probe location are only available for the ambient ear-canal pressure measurements and are "NULL" for the pressurized canal measurements.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Data collected on system described in Rasetshwane and Neely (2011)
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               "Custom system ER10B+ used for measurement of ear canal impedance designed by Dr. J. H. Siegel at Northwestern University.\n\nEar canal areas were estimated from the surge impedance.\n\nImpedance was measured at 2 insertion depths (deep - session 1, shallow - session 2)."
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## 6  "HearID (Mimosa Acoustics); \nNormal Criteria as follows: \n(1) There was no history of significant middle ear disease (e.g., otitis media or effusion 2 or more years previously were not considered significant if there were no known residual consequences).\n(2) There was no history of otologic surgery, with the exception of myringotomy or tympanostomy tube placement over 2 yr prior. \n(3) The external ear and TM revealed no abnormalities on otoscopic examination. \n(4) Audiometric measurements had pure-tone thresholds of 20 dB HL or better at octave frequen- cies between 0.250 and 8 kHz. \n(5) Air-bone gaps were no greater than 15 dB at 0.25 kHz and 10 dB between frequencies of 0.5 to 4 kHz. Most subjects had air and bone thresholds between 0 and 10 dB HL with an average near 8 to 9 dB HL at the highest frequencies. \n(6) Tympanograms were Type-A peaked, with peak pressures of 100 to 50 daPa, static compliance of 0.3 to 2.0 cc, total tympanometric volumes (static compliance ear canal volume) between 0.7 and 2.7 cc, and normal-appearing shape that is neither rounded nor sharp. \n(7) All subjects included in the ""normal hearing"" population were required to have two ""normal"" ears (as defined by criteria described earlier). "
## 7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## 8                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         A research version of Titan (Interacoustics) was used. In this study, a total of five reflectance measurements at ambient pressure were taken per ear (detailed in the article).  Included in this database is the baseline session, defined as C in the paper, as the normative data.
## 9                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## 10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 11                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## 12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Used an ER-1 earphone and ER-7C microphone.  Data provided by Doug Keefe and formatted by Susan Voss with help.  Lynne Werner is retired.
# be careful with collect() when dealing with large tables!

Note how the number of rows is unknown (?? at the top of the output above) for the lazy query.
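
One way to see why the row count is unknown is to look at the SQL that a lazy query stands for. The following is a minimal sketch rather than part of the original analysis: it assumes the dbplyr package is installed and uses its lazy_frame() and simulate_mysql() helpers, so no database connection is needed. The two-row subject_stub table is made up for illustration and is not the real Subject table.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for a remote table (made-up values, no connection needed).
subject_stub <- lazy_frame(
  Identifier = c("Voss_2010", "Voss_2010"),
  Age = c(20L, 39L),
  con = simulate_mysql()
)

# Nothing is computed here: dplyr only records a description of the query.
count_query <- subject_stub %>% summarise(total = n())

# sql_render() returns the SQL that would be sent to the server.
count_sql <- as.character(sql_render(count_query))
cat(count_sql, "\n")
```

Because the query is only run when its results are requested, R cannot know how many rows will come back until the server has done the work.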

Similarly, we can explore the Subject table.

Subject %>% summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log [waiuser@scidb.smith.edu:/wai]
##   total
##   <dbl>
## 1   640
Subject %>% collect()  # be careful with collect() with large tables!
## # A tibble: 640 x 10
##    Identifier Sub_Number Session_Total   Age Female  Race Ethnicity
##    <chr>      <chr>              <int> <int>  <int> <int>     <int>
##  1 Voss_2010  1                      5    20      1     0         0
##  2 Voss_2010  2                      5    39      1     0         0
##  3 Voss_2010  3                      5    18      1     0         0
##  4 Voss_2010  4                      5    19      1     0         0
##  5 Voss_2010  5                      5    21      1     0         0
##  6 Voss_2010  6                      5    21      1     0         0
##  7 Voss_2010  7                      5    21      1     0         0
##  8 Voss_2010  8                      5    42      1     0         0
##  9 Voss_2010  9                      5    38      0     0         0
## 10 Voss_2010  10                     5    20      1     0         0
## # ... with 630 more rows, and 3 more variables: Left_Ear_Status <int>,
## #   Right_Ear_Status <int>, Sub_Notes <chr>

Let’s explore the Measurements table.

Measurements %>% summarise(total = n())
## # Source:   lazy query [?? x 1]
## # Database: mysql 5.5.58-0ubuntu0.14.04.1-log [waiuser@scidb.smith.edu:/wai]
##    total
##    <dbl>
## 1 286774

There are more than a quarter million observations.
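
With a table of this size it is usually preferable to let the server do the aggregation and download only the summary. The sketch below is an illustration under assumptions, not the post's own code: it again uses dbplyr's lazy_frame() with simulate_mysql() in place of a live connection, and measurements_stub is a made-up table that borrows only two column names from Measurements.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stub with two of the Measurements columns (made-up values).
measurements_stub <- lazy_frame(
  Identifier = c("Rosowski_2012", "Voss_2010"),
  Absorbance = c(0.0852, 0.1),
  con = simulate_mysql(),
  .name = "Measurements"
)

# Count rows per study; this is translated to a GROUP BY on the server side.
per_study_query <- measurements_stub %>%
  group_by(Identifier) %>%
  summarise(total = n())

per_study_sql <- as.character(sql_render(per_study_query))
cat(per_study_sql, "\n")
```

Applied to the real Measurements table, a pipeline of this shape followed by collect() would bring back one row per study instead of every measurement.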

In the next step, we will download the data from a given subject for a specific study, in this case a paper by Rosowski et al. (2012) entitled “Ear-canal reflectance, umbo velocity, and tympanometry in normal-hearing adults”.

We arbitrarily choose to collect the data for subject number three.

onesubj <- 
  Measurements %>% 
  filter(Identifier == "Rosowski_2012", Sub_Number == 3) %>%
  collect() %>%
  mutate(SessionNum = as.factor(Session))
head(onesubj)
## # A tibble: 6 x 12
##   Identifier Sub_Number Session Left_Ear   MEP Instrument  Freq Absorbance
##   <chr>           <int>   <int>    <int> <dbl>      <int> <dbl>      <dbl>
## 1 Rosowski_~          3       1        1    NA          1  211.     0.0852
## 2 Rosowski_~          3       1        1    NA          1  234.     0.0903
## 3 Rosowski_~          3       1        1    NA          1  258.     0.112 
## 4 Rosowski_~          3       1        1    NA          1  281.     0.103 
## 5 Rosowski_~          3       1        1    NA          1  305.     0.129 
## 6 Rosowski_~          3       1        1    NA          1  328.     0.136 
## # ... with 4 more variables: Zmag <dbl>, Zang <dbl>, Canal_Area <dbl>,
## #   SessionNum <fct>
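
The filter() step above is translated into a SQL WHERE clause, so only the matching rows leave the server. A rough sketch of that translation follows. It assumes dbplyr is installed and uses lazy_frame() with simulate_mysql() instead of the live connection; the one-row measurements_stub is hypothetical and carries only three of the Measurements columns.

```r
library(dplyr)
library(dbplyr)

# Hypothetical one-row stub standing in for the remote Measurements table.
measurements_stub <- lazy_frame(
  Identifier = "Rosowski_2012",
  Sub_Number = 3L,
  Session = 1L,
  con = simulate_mysql(),
  .name = "Measurements"
)

# Same filter as in the text; it is not evaluated, only translated.
one_subject_query <- measurements_stub %>%
  filter(Identifier == "Rosowski_2012", Sub_Number == 3)

filter_sql <- as.character(sql_render(one_subject_query))
cat(filter_sql, "\n")
```

A hand-written request along the same lines could be sent with the dbGetQuery() function used earlier, for example dbGetQuery(con, "SELECT * FROM Measurements WHERE Identifier = 'Rosowski_2012' AND Sub_Number = 3"), which returns the matching rows as a data frame.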

Finally, we can display absorbance as a function of frequency, with the points coloured by the ear (left or right) that was measured.

onesubj <- mutate(onesubj, Ear = ifelse(Left_Ear == 1, "Left", "Right"))
ggplot(onesubj, aes(x = Freq, y = Absorbance, colour = Ear)) +
  geom_point() +
  scale_x_log10() +  # frequency on a log scale
  labs(title = "Absorbance by ear Rosowski subject 3")

In summary, SQL is a powerful tool, and a small number of commands is enough to integrate an existing database into an analysis. Particularly if instructors use RMarkdown, data ingestion can be scaffolded so that students modify and augment code that is provided to them.

We note that a number of relational database systems exist, including MySQL (illustrated here), PostgreSQL, and SQLite. More information about databases within R can be found in the CRAN Databases with R Task View.

Setting up and managing a database is a topic for a different day: here we focused on how SQL can be used within R to access data in a flexible and powerful manner.

About this blog

Each day during the summer of 2019 we intend to add a new entry to this blog on a given topic of interest to educators teaching data science and statistics courses. Each entry is intended to provide a short overview of why it is interesting and how it can be applied to teaching. We anticipate that these introductory pieces can be digested daily in 20 or 30 minute chunks that will leave you in a position to decide whether to explore more or integrate the material into your own classes. By following along for the summer, we hope that you will develop a clearer sense for the fast moving landscape of data science. Sign up for emails at https://groups.google.com/forum/#!forum/teach-data-science (you must be logged into Google to sign up).

We always welcome comments on entries and suggestions for new ones.