Interdisciplinary data science workshop moves from San Diego to spare bedrooms, but science continues

Editor's Note: The information published in this story is accurate at the time of publication. Always refer to UAB's current guidelines and recommendations relating to COVID-19.

Southside, not San Diego: Jake Chen, Ph.D., chief bioinformatics officer for UAB's Informatics Institute, leads a discussion during the BIOKDD workshop that he organized with assistant professor Da Yan, Ph.D., of the Department of Computer Science. The virtual version of the two-decade-old workshop brought to light several advantages of online science events, Chen said.

Academic conferences would never be mistaken for spring break parties, but when the clock nears 5 p.m., most attendees' thoughts have turned from professional interests to the cocktail mixers that often cap each day.

So it was unusual, to say the least, to hear a participant in BIOKDD 2020: International Workshop on Data Mining in Bioinformatics beg his colleagues' pardon for dropping out of a post-talk discussion with the words, "It's past midnight here, so I will need to sign off — I need to get up early tomorrow." Welcome to scientific gatherings, COVID-style.

Annual meetup for bioinformatics researchers

For the past 19 years, data scientists working in bioinformatics have been convening in BIOKDD workshops during the annual conference of the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), now the world’s largest data science conference. BIOKDD is named in honor of its parent, the KDD conference.

The workshops are designed "to promote interdisciplinary research in the areas of data science and computational biology/bioinformatics," said Jake Chen, Ph.D., chief bioinformatics officer and associate director for UAB's Informatics Institute, who has been general chair of BIOKDD for more than 15 years. "For the past three years, Dr. Da Yan [assistant professor in the Department of Computer Science] and I have been organizing it together with other co-organizers worldwide."

The BIOKDD workshops usually draw anywhere from 50 to 100 participants representing academia, industry and government labs, Chen says. Like many major meetings, the parent SIGKDD conference tends to take place in sought-after travel destinations around the world, including Sydney, London, Paris, Beijing and New York City.

‘We actually gained participation’

In 2020, SIGKDD and BIOKDD were set to convene by the beach in San Diego. Instead, with COVID-19 rampant and travel impossible, Chen, Yan and their co-organizers were forced to regroup. "When we organized the workshop this year back in January, we only had a biomedical research interest," Chen said. By late March, however, that had changed. "It didn't take us long to decide to make COVID-19 a theme for this year," he said. The organizers also refashioned the workshop in a virtual format.

"We actually gained participation" with the switch, Chen added. He attributes this to "the significance of the topic, the quality of presenters that we invited from around the world and the online format itself."

Instead of gathering in a hotel ballroom, participants dialed in via Zoom from their labs and home offices. "Most of the session talks were done live, but since this was an international workshop, some had to be delivered as recorded video, given the time zone differences for Asian and European participants," Chen said. The event was run on U.S. Pacific time, as it would have been had the workshop taken place in person in San Diego.

Deep learning, disparities and databases

Keynote speaker Geoffrey C. Fox, Ph.D., director of the Digital Science Center at Indiana University Bloomington, gave a talk titled "Deep Learning for Biomedical and Science Time Series," including a demonstration of predictive models that his lab had trained on daily COVID-19 infection statistics.

A screenshot from one of the BIOKDD presentations. Watch the entire proceedings here.

Li Li, Ph.D., vice president of Clinical Informatics at the “health intelligence” company Sema4 and assistant professor at the Icahn School of Medicine at Mount Sinai, was the other keynote speaker. She shared her work in analyzing the Mount Sinai Health System electronic medical record system to uncover racial disparities and identify prognostic factors in the COVID-19 hospitalized patient population.

Among many other sessions, Reid Thompson, M.D., Ph.D., assistant professor at Oregon Health & Science University, presented his efforts to map susceptibility to SARS-CoV-2 based on the diversity of human leukocyte antigen variants around the world. And Chen presented his research on PAGER-CoV, an online, curated gene signature database offering resources for functional genomic studies of COVID-19.

Participants were positive about the experience, Chen noted. "They said, 'This was an excellent workshop,' 'We learned a lot' and 'We will come back to attend again next year,'" he said.

A special issue on artificial intelligence-enabled data science for COVID-19, edited by Chen, Yan and others, will appear in the journals Frontiers in Big Data: Medicine and Public Health and Frontiers in Artificial Intelligence: Medicine and Public Health. Submissions are open to both workshop participants and general contributors, Chen says.

Hybrid is 'here to stay'

What advice does Chen have to colleagues on organizing a successful virtual conference? "Putting together a great program is a must," he said. "Advertising through both conventional media — emails, physical flyers, school-wide websites — and social media (Twitter and Google) is important. Also, having backup plans for 'no-shows' is essential, too."

Chen said that video-based virtual meetings will last beyond the COVID-19 pandemic. "I do believe all future conference organizers will incorporate a degree of virtual, online, interactive content from high-quality speakers worldwide," he said. "Perhaps not entirely virtual, but a hybrid model is here to stay." Next year, BIOKDD "will still incorporate virtual meeting components, for example, giving invited speakers the option to present virtually, even if the meeting will be held in person."

Dig in on COVID data science

The Informatics Institute has created a COVID-19 biomedical data science portal with analysis tools, data sets and training materials.