Triangle Research Data Center: More than Counting People

Wednesday, May 14, 2014

“Unfortunately, it’s one of Duke’s well-kept secrets,” said Gale Boyd, executive director of the Triangle Research Data Center, now located on the second floor of Gross Hall. The Triangle RDC allows researchers whose proposals have been approved by the U.S. Census Bureau to access confidential microdata not released to the public. The Triangle RDC is the only one in the Southeast, and one of only nine around the country.

“There are a lot of people who could avail themselves of this fairly distinctive resource, but don’t even know it’s here,” Boyd said. That was brought home to him last October, when Duke hosted the Census RDC national research conference. “Someone came up to me and said, ‘I’m so glad to know about this.’ I thought it was someone from one of the other schools. Turns out the person was from the sociology department, one building over from us.”

What makes the Census RDCs uniquely valuable to researchers is both the level of geographical and demographic detail and the broad range of microdata researchers can access. “Instead of looking at all of Durham County, research can be done in some cases with individual household-level data. It all still operates under very strong confidentiality restrictions. None of that data ever leaves; in fact, none of it even physically resides here at Duke,” Boyd explained. Secure computers in the RDC lab connect to the Census Bureau headquarters in Maryland, where the actual computing is done.

“Most people think of the census every 10 years, but the range of data is much more than that. An economic census is conducted every five years, with lots of statistical surveys sampling specific population groups for specific information on households as well as businesses,” Boyd said. In addition, the Census Bureau has agreements with two major agencies that gather data on healthcare, the National Center for Health Statistics (NCHS) and the Agency for Healthcare Research and Quality (AHRQ), to provide access to their data through the Triangle RDC as well.

Through a funding partnership with UNC, any researcher from Duke or the UNC system can use the Triangle RDC at no cost. “I like to say that using the RDC is not free for Duke and UNC faculty, it’s paid for,” said Boyd. “If you had to get a grant to pay the fees to use the data center, you would have a chicken and egg problem. This way you can do preliminary research on your own research time and demonstrate a research concept without paying fees, and then go out and apply for grants for travel money and to pay grad students. The university and the research community all benefit from that.” Fees charged at other RDCs can be $10,000 to $15,000 per project per year.

Recent research conducted with Triangle RDC data includes a Duke study by Kirk White that overturns stereotypes about gentrification in urban neighborhoods, featured in TIME Magazine, and a UNC-Chapel Hill study by Quinfang Wang on ethnic divisions of labor in U.S. cities, soon to be published in the Annals of the Association of American Geographers. “The confidential data are extremely precious for researchers,” said Wang, now an assistant professor of geography and earth sciences at UNC-Charlotte.

The lab administrator is Bert Grider, a UNC-Chapel Hill graduate student and full-time U.S. Census Bureau employee. Grider assists users, safeguards data confidentiality, and guides researchers through the proposal process. “I’m the gatekeeper, but my job isn’t to keep researchers out. Part of my job description is to help people get in the door.”

“The proposal process is different from writing an NSF proposal or a standard funding proposal. Since federal law only authorizes the Census Bureau to collect data, a proposal must show how the research will benefit the Census Bureau’s data collection programs. I help researchers tailor their proposals to meet this legal requirement. I also help researchers determine what data are available, in that I can look at particular data sets and tell researchers, yes, the variable you’re interested in is there, or that it’s possible to merge one data set with another. A lot of times it’s merging the different data sets that puts the pieces together to make the research puzzle work,” Grider said.

“We’re constantly working with the Census Bureau to help people navigate the proposal process and make sure they can get their projects approved in a timely manner so they can do the research they want to do,” Boyd added. He has been using confidential data in his own research for 15 years, and remembers when he used to travel from Chicago to Washington, D.C., to get the kind of access his lab now offers. “It is a much more efficient way to get research done,” said Boyd.

He is something of an expert on efficiency: he currently develops benchmarks for measuring energy efficiency and productivity in manufacturing plants for the EPA’s Energy Star program. “I’m both a user of the RDC as well as the director. Most of my time is as a researcher,” said Boyd. “In the very competitive academic and research market, having something like that, using the confidential data, makes research stand out.”