Originally appeared at https://www.wm.edu/as/data-science/researchlabs/geolab/news/geolab-2019.php
Rachel Oberman got a call from Dan Runfola one Thursday evening during her sophomore year.
Runfola was giving a presentation to the Bill and Melinda Gates Foundation, and needed corrected boundary maps. For all countries in the world. By Monday.
Oberman ’19 is the founder of the undergraduate research component of William & Mary’s geoLab and its first student director. The geoLab is a collection of student-led research initiatives mentored by Runfola, assistant professor in William & Mary’s Department of Applied Science. The students who work in the geoLab use satellite imagery and other data to craft artificial intelligence-driven products that NGOs and governmental agencies continue to find useful.
The Gates Foundation request is a good example. Oberman explained that foundation personnel wanted to make sure that their aid would be delivered to the proper place — existing maps compiled from copyrighted information were not always accurate at the necessary level of detail.
“Every country has national, state and district boundaries,” she said. “Every country in the world has something like that.”
One of the four all-undergraduate geoLab teams actually is called geoBoundaries. It’s dedicated to using satellite imagery and open-source data to produce the most accurate maps possible. To handle the Gates Foundation’s request, Oberman assembled a team of undergraduate cartographers for a weekend of woodshedding. They met that deadline.
“I was calling people Friday night and I was saying, ‘Hey buddy! What are you doing this weekend? Anything fun? Want to make some maps with me instead?’” Oberman recalled.
Runfola says he founded the geoLab about three years ago to fill a need at the university.
“There were not any student-centric research labs that were heavily focused on the intersection between spatial data, satellite data and AI,” he said. “I wanted to create an opportunity for students to get involved in this area — and it has a lot of different foci within it.”
To address some of those foci, the geoLab is made up of four research teams. In addition to geoBoundaries, there are geoParsing, geoData and geoDev teams. Each team is student-led and has a different focus. Oberman is the geoDev team lead in addition to serving as student director of the entire geoLab initiative.
John Napoli ’20 is the geoData team lead. Matt Crittenden ’21 leads geoParsing and Josh Panganiban ’20 is team lead for geoBoundaries. There are 28 students, all undergraduates, working in the teams. Each team has its own set of projects and the work of two or more teams will often intertwine.
For example, the geoParsing group makes heavy use of satellite data. The geoLab has a working arrangement with the National Geospatial-Intelligence Agency, a combat support division of the U.S. Department of Defense. The agreement allows the geoLab personnel to request satellites be tasked in their hunt for raw data.
The work of the geoData team differs from geoBoundaries and geoParsing in that the members work with information that is closely associated with geographic boundaries, but does not come from a satellite. Runfola cites the example of census data.
“Census data is not a picture,” he said. “It’s tabular data — but it links with spatial data.”
And the geoData team assembles non-pictorial information such as census data, strategically linking it to corresponding work by the geoBoundaries team, Runfola said. The goal is to make available a set of integrated products that a user can download. Runfola said the students have been working on a worldwide geoBoundaries/geoData map of childhood health indicators.
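As a rough illustration of what linking tabular data to boundary data can look like, here is a minimal Python sketch. It assumes that each boundary record and each tabular row carry a shared identifier. The field names (`unit_id`, `indicator`, `value`), the sample records and the function name are all hypothetical and are not taken from the geoLab's actual products.

```python
# Hypothetical sample data: boundary records and tabular rows that share a key.
boundaries = [
    {"unit_id": "A01", "name": "District A", "level": "district"},
    {"unit_id": "B02", "name": "District B", "level": "district"},
    {"unit_id": "C03", "name": "District C", "level": "district"},
]
indicators = [
    {"unit_id": "A01", "indicator": "example_rate", "value": 0.12},
    {"unit_id": "B02", "indicator": "example_rate", "value": 0.08},
]


def link_tabular_to_boundaries(boundaries, rows, key="unit_id"):
    """Attach tabular indicator values to boundary records that share a key."""
    # Group the tabular rows by the shared key: {key: {indicator: value}}.
    by_key = {}
    for row in rows:
        by_key.setdefault(row[key], {})[row["indicator"]] = row["value"]

    # Copy each boundary record and attach whatever indicators match it.
    linked = []
    for boundary in boundaries:
        record = dict(boundary)
        record["indicators"] = by_key.get(boundary[key], {})
        linked.append(record)
    return linked


linked = link_tabular_to_boundaries(boundaries, indicators)
for record in linked:
    print(record["name"], record["indicators"])
```

In this sketch, a boundary with no matching rows keeps an empty set of indicators instead of being dropped, so gaps in coverage stay visible.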
“That is an incredibly ambitious goal that they’ve been working on for going on two years now,” he said. “I think now we have nearly global coverage for around 50 different variables. And we’re starting now to go down to state level, around the globe.”
Members of the geoLab are expected to devote a minimum of four hours a week to their team’s projects. Experienced lab members can become geoFellows, lab leaders who receive financial support.
The work of the geoLab requires a wide range of skills and viewpoints, and therefore the lab makeup is necessarily interdisciplinary. Oberman, the student director, is a computer science/data science double major, but the 28 current geoLab participants represent a wide swath of academic interests.
Much of the geoLab’s work involves application of a convolutional neural network, an image recognition and sorting technology that’s the basis of facial recognition software and similar computation tools. Runfola said that the geoDev team is most involved in convolutional neural network applications. It’s the team that tends to attract the data science and computer science majors.
“They do any kind of image recognition data, image parsing — commonly they use satellite images,” he said.
And the geoDev team is also responsible for the Sex Assumption Camera. It looks out on the Integrated Science Center atrium from the geoLab’s headquarters. There is a hand-printed note below the lens: “Come closer.”
You go closer, and you are caught by the Sex Assumption Camera. Your image appears on an adjacent monitor, a blue rectangle framing your face as facial recognition software adds your assumed sex to a tally in either a pink or blue category.
“It’s just for fun,” Runfola said, but another note — “Is this camera ethical?” — triggers thoughts about privacy issues in the 21st century.
Runfola says each team tends to attract different kinds of students. For instance, the geoParsing team includes majors from international relations, linguistics, English and art history.
“Let’s just say that people that are interested in intelligence work are drawn to the geoParsing team,” he said. “It’s because we work with the intelligence community a lot.”
Some geoParsing students are working in a partnership with the National Geospatial-Intelligence Agency (NGA) to track China’s activities in Latin America.
“This is not necessarily where China is giving money,” Runfola said. “But more like where they’re allocating people. Where are they putting resources? What types of political gains are they hoping for? We’re putting all of that on a map that will show the geographic contexts that China prefers.”
The geoLab activities operate within an ongoing and revolving set of partnerships with government agencies, NGOs and the academic community. For instance, the geoBoundaries team partnered with Columbia University’s Center for International Earth Science Information Network on a Gates Foundation project aimed at identifying and correcting the borders of African states. Some of the partnerships involve non-disclosure agreements, but the majority of the geoLab’s products are freely available for download.
The geoLab initiative as it now exists had its genesis when Rachel Oberman walked into Runfola’s office during her freshman year. She needed his approval to take his Data and Decisionmaking class. She was armed with what she now calls “a bad version of a resume,” as she was interested in getting involved in research.
“He also was thinking about starting a research lab and he asked if I would like to work with him on developing this undergraduate research lab,” Oberman said. “And that’s how it all started. The following fall it was me sitting with my peers, trying to figure out how to turn their work into a project for an undergraduate team.”
And now Oberman is nearing graduation; she’ll leave William & Mary at the end of the fall 2019 semester. She has accepted a job offer as an analytics and machine learning technical consulting engineer with chipmaker Intel Corp. in Portland. The geoLab’s emphasis on undergraduate involvement has resulted in a high placement rate for jobs and internships.
“The students in geoLab are basically doing the same things that an entry-level analyst would be doing internally at these agencies that we partner with,” Runfola said.
The geoLab gives students deep dives into the world of artificial intelligence and machine learning, which mean the same thing in most contexts, Runfola says.
“When people ask me about that, the answer I like to give is that when I’m writing an academic proposal to the National Science Foundation or something, it’s ‘machine learning,’” he said. “If I’m writing to a private company or a corporate foundation, it’s ‘artificial intelligence.’ So, there is no difference at that level.”
At another level, Runfola said that artificial intelligence is more often used in referring to models that update themselves in real time, whereas machine learning is usually applied to data that isn’t necessarily updating itself in real time.
“But that’s an arbitrary distinction,” he said. “And ‘artificial intelligence’ does sound sexier.”