Cancer impacts millions of individuals worldwide, but much remains unknown about the complex disease. University scientists recently conducted a computational data analysis on six cancer types and discovered that changes in the arrangement and structure of chromosomes result from the unusual binding of CTCF. The research team hopes its findings can be applied to gain a greater understanding of cancer biology and potentially lead to a cancer treatment.
CTCF is a transcription factor that controls gene expression by maintaining chromatin structure, which is the “bead-like” organization of DNA. The unusual binding in cancer cells causes differences in chromatin structure, which ultimately result in changes to gene expression.
Chongzhi Zang, assistant professor in the Center for Public Health Genomics and principal investigator of the study, was exploring complicated, publicly available data regarding chromosome structure when he was inspired to find potential patterns related to cancer.
“We have known the human genome sequence for about 20 years — since the completion of the Human Genome Project — but the next question is how to integrate the human genome,” Zang said. “In other words, how do the genes work in the genome?”
Researchers in Zang’s lab conducted an integrative computational analysis on publicly available data for six cancer types, including acute myeloid leukemia, breast cancer, lung cancer and prostate cancer.
“Around the time 2018 or 2019, there were four cancer types available, and then, we had an idea of what we can focus on after we get the whole idea of the whole project,” said Zhenjia Wang, a postdoctoral research associate in Zang’s lab. “Then we wrap up the story and at that time more data will be publicly available, so we just re-analyze all the data and we collect more data to expand our analysis.”
Wang was responsible for collecting publicly available raw data for cancerous and normal human tissue cells. The data consisted of short segments of sequenced DNA, which were mapped onto the genome to form one data set.
Wang cleaned up the sets to remove irrelevant data and compiled 771 data sets to create a master data set, which was used for computational analysis. This investigation required a range of computational tools and methods. Some of these tools were publicly available from other researchers, while others were novel methods that Wang developed.
“The main research direction in my lab is to develop new computational methods for analyzing genomic data,” Zang said. “So [Wang] combined the tools that she developed, as well as other tools, and that’s why we call this integrative computational analysis.”
As Wang organized and analyzed data, she had to master the variety of techniques used for chromosome sequencing in order to separate valuable data from unwanted information. Additionally, Wang had to learn how to conduct computational analysis across publicly available data sets that were acquired through different sequencing techniques.
Using these analyses, the research team was able to determine specific patterns of CTCF binding for each cancer type. For instance, in certain cancers, CTCF did not appear where it should be in the cell, indicating cancer-specific loss. In other cancer types, it appeared where it should not be, indicating cancer-specific gain.
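To make the idea of cancer-specific loss and gain concrete, here is a minimal Python sketch. It is not the lab's method: the function name `classify_sites`, the site labels, the toy data and the 80 percent and 20 percent cutoffs are all assumptions chosen for illustration. Each site is labeled by comparing how often CTCF is bound there in normal samples versus cancer samples.

```python
def classify_sites(normal, cancer, high=0.8, low=0.2):
    """Label each site as 'loss', 'gain' or 'unchanged'.

    normal, cancer: dicts mapping a site label to a list of 0/1 flags,
    one flag per sample (1 = CTCF bound at that site in that sample).
    high, low: hypothetical occupancy cutoffs, not taken from the study.
    """
    calls = {}
    # Only compare sites measured in both groups.
    for site in sorted(normal.keys() & cancer.keys()):
        normal_occupancy = sum(normal[site]) / len(normal[site])
        cancer_occupancy = sum(cancer[site]) / len(cancer[site])
        if normal_occupancy >= high and cancer_occupancy <= low:
            calls[site] = "loss"       # bound in normal cells, missing in cancer
        elif normal_occupancy <= low and cancer_occupancy >= high:
            calls[site] = "gain"       # absent in normal cells, bound in cancer
        else:
            calls[site] = "unchanged"
    return calls


# Toy data (invented): three sites, a handful of samples each.
normal_samples = {"site_A": [1, 1, 1, 1], "site_B": [0, 0, 0, 0], "site_C": [1, 1, 1, 1]}
cancer_samples = {"site_A": [0, 0, 0, 0], "site_B": [1, 1, 1, 1, 0], "site_C": [1, 1, 1, 1]}

calls = classify_sites(normal_samples, cancer_samples)
for site, label in calls.items():
    print(site, label)
```

In this toy example, the first site is bound in every normal sample and in no cancer sample, so it is labeled a loss. The second shows the reverse pattern and is labeled a gain. A real analysis would need statistical testing and far more samples than this sketch uses.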
The researchers validated their results in collaboration with a laboratory at Northwestern University, whose scientists tested the computational findings by conducting DNA sequencing experiments.
“All of our computational work is describing what was exactly happening [at] the molecular level,” Zang said. “The only thing is that we need real wet-lab experiments to validate our findings, so that’s what our collaborator did.”
The researchers at Northwestern University conducted genomic experiments solely on cell cultures of T-cells specific to acute lymphoblastic leukemia. T-cells play a critical role in immunity against foreign substances. The researchers used genomic sequencing techniques to measure patterns of transcription factors in the genome. In particular, they looked at the differences in chromatin structure between cancerous cells and normal cells.
The current understanding of cancer proliferation is complicated, but it involves both internal factors, like DNA mutations, and environmental factors, such as a patient’s lifestyle. Many aspects of cancer cell proliferation still require more research, and Zang said the research conducted in his laboratory is just one small piece of the large puzzle of cancer genomics.
Zang hopes to collaborate with other researchers to experimentally validate the findings in the five other cancer types. Additionally, in future studies, he plans to develop a more detailed understanding of the cellular mechanisms and functions behind the patterns the team discovered.
“If we have the resources and collaborators, we want to validate these findings in the other five cancer types,” Zang said. “But we were confident that this cancer-specific CTCF binding is a signature that probably exists in every cancer type.”
Zang emphasized that the team’s findings do not imply a cure or treatment for the disease. They are simply small steps that lead researchers closer to a better understanding of cancer. He hopes other scientists and industries can use the results in conjunction with other findings to eventually develop a solution for the disease.
“Our hope is that eventually other researchers and colleagues in both academia and industry, [such as] pharmaceutical companies, can take advantage of and [combine] all the findings in the field, including ours, helping them to develop more drugs or more novel therapeutics to fight cancer,” Zang said.