Researchers see hope, progress in big data

Studies across vast information sets are aiding patient testing, treatment

Updated July 29, 2022

An algorithm applied to a cat’s medical data can help predict chronic kidney disease two years earlier than traditional diagnosis.

Developing that tool involved using artificial intelligence to analyze more than 100,000 patient medical records and detect previously hidden patterns.

Darren Logan, PhD, head of research at the Waltham Petcare Science Institute, said that, as more cats are tested using the RenalTech tool, more cat owners switch their pets to diets that could slow disease development. He expects to see in 5-10 years whether those tests and interventions are reducing overall disease.

Waltham scientists developed the tests using data collected from veterinary hospitals over the past 20 years. The institute, which is owned by Mars Inc., is now partnering with Mars-owned Banfield, BluePearl, and VCA hospitals to develop an extensive biobank over a decade, with data from 10,000 dogs and 10,000 cats. The biobank is one example of the large-scale data projects—or big data—that could lead to more advanced diagnostics and earlier detection of myriad diseases.

Dr. Logan said the biobank will include recorded results of routine wellness checks and tests on blood and fecal samples, genome sequences for those pets, gut bacteria analysis, and information provided by owners through questionnaires about home environments and lifestyles.

“From a data perspective, it will be the biggest single collection of biological data in companion animals’ history,” he said.

Dr. Audrey Ruple said researchers are starting to see true, whole-life information about animals. The ability to combine vast amounts of clinical, genetic, climate, and environmental data is changing how researchers look at risk, propensity scores, health outcomes, and how clinicians can prevent disease, she said.

Dr. Ruple is an associate professor of quantitative epidemiology in the Department of Population Health Sciences at the Virginia-Maryland College of Veterinary Medicine. She also is part of the research team and a member of the executive operations team for the Dog Aging Project, which is collecting various data from tens of thousands of dogs to better understand the factors that influence changes in aging dogs. The project incorporates medical records, survey information from dog owners, other information from veterinarians, biochemical profiles, ability tests, and environmental data.

All of that helps researchers find the true drivers of disease occurrence, including the interactions between genetics and environmental exposures, Dr. Ruple said.

“Being able to determine those things really helps us to help animals to live longer, healthier lives,” she said.

Much of that information also could translate into advancements in human health care, she added.

Collecting and applying data

Dr. Ruple is the corresponding author for a review article, published in 2021 by the journal Animals, on big data in veterinary medicine. She and her co-authors wrote that veterinary medical data may be underutilized in medical research.

Humans and other animals often develop similar diseases with similar genetic or external causes and similar clinical outcomes. Plus, working with veterinary data presents fewer challenges related to privacy and confidentiality concerns, the article states. Dogs have the most phenotypic diversity and known naturally occurring diseases among land mammals. They develop about 400 inherited disorders relevant to human medicine, and they share humans’ physical and chemical environments, the authors write.

A 2017 article published by Frontiers in Veterinary Science also describes the potential to use big data to advance veterinary epidemiology by helping identify animals at high risk of infectious disease as well as to spot anomalous events that could serve as warnings. The article expounds on the need for scientists to adapt by developing skills in subjects such as machine learning and coding that could help veterinary epidemiologists engage, manipulate, and analyze large data sets.

Academic and nonprofit institutions in the U.S., United Kingdom, and Australia have started or become involved in data aggregation projects, some of which have collected millions to tens of millions of records, according to the Association for Veterinary Informatics. Those projects are aiding work such as efforts to improve health outcomes, support judicious antimicrobial use, understand genetic causes of diseases in animals, and identify the health effects of climate and environment.

In addition to data repositories for veterinary records, there are dedicated registries such as the previously mentioned Dog Aging Project, which has collected information on tens of thousands of dogs, and Morris Animal Foundation’s Golden Retriever Lifetime Study, which gathers health, environmental, and behavioral data from more than 3,000 dogs each year.

At a December 2021 workshop for the National Academies of Sciences, Engineering, and Medicine, Dr. Ruple was among presenters on the roles of companion animals as sentinels for predicting the effects in humans of environmental exposures, specifically effects on aging and cancer susceptibility. She said human medicine is increasingly seeing the value of using the large volumes of data collected on pets, and she sees potential for more cross-discipline work.

“There are signs of improvement in terms of utilizing big data, and I think that part of it is people are starting to really recognize the value of it,” Dr. Ruple said. “There’s a lot of benefit to using veterinary health data sets as compared to using human health data sets.”

The Dog Aging Project, for example, includes collaborations with gerontologists, epidemiologists, computer scientists, geologists, and geographers.

“We’ve got all of these different disciplines that are truly working together in an integrated way to look at moving human health and veterinary health knowledge forward,” she said.

Dr. Rachael Kreisler, immediate past president of the Association for Veterinary Informatics and associate professor of shelter medicine and epidemiology at Midwestern University College of Veterinary Medicine, said analyses of veterinary projects’ data have led to clinical insights published in hundreds of scientific articles. But she also noted some drawbacks, including patient populations at academic institutions that differ from patient populations in general practice and the challenges of combining veterinary data from various sources.

Significant amounts of patient data in veterinary medicine are recorded in unstructured free text, which can be difficult to parse for meaning. Even when structured diagnoses are entered, they are often unique to the practice or even the doctor making the diagnosis. Aggregating patient records requires reconciling those differences, and such data cleaning can be labor intensive.

Dr. Kreisler said this lack of structure prevents clinicians and researchers from accessing the depth and value of the data generated by veterinarians. She recommends that companion animal veterinary practices adopt the Problem and Diagnosis Terms developed by the American Animal Hospital Association, which are freely available. There is also standardized terminology for equine and specialty practices. These standardized terms allow veterinarians to “speak the same language,” enabling both clinicians and researchers to have insight into their clinical data.
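As a rough illustration of what reconciling practice-specific diagnoses involves, the sketch below maps several spellings of the same condition to one shared term and flags anything it cannot map for manual review. The mapping table, function names, and sample records are hypothetical and invented for this example; they are not drawn from the AAHA term list or from any project described here.

```python
import re
from collections import Counter

# Hypothetical lookup from practice-specific wording to one shared term.
# Illustrative entries only; this is not the AAHA terminology.
STANDARD_TERMS = {
    "ckd": "chronic kidney disease",
    "chronic renal failure": "chronic kidney disease",
    "chronic kidney disease": "chronic kidney disease",
    "addisons": "hypoadrenocorticism",
    "addison's disease": "hypoadrenocorticism",
}


def normalize_diagnosis(raw):
    """Return the shared term for a diagnosis entry, or None if it is unmapped."""
    # Lowercase, collapse repeated whitespace, and straighten curly apostrophes.
    key = re.sub(r"\s+", " ", raw.strip().lower()).replace("’", "'")
    return STANDARD_TERMS.get(key)


def tally(records):
    """Count records per shared term; unmapped entries are grouped for review."""
    return Counter(normalize_diagnosis(r) or "UNMAPPED" for r in records)


sample = ["CKD", "Chronic renal  failure", "chronic kidney disease",
          "Addison’s disease", "limping"]
print(tally(sample))
```

Without the lookup step, the three kidney disease entries would be counted as three different diagnoses. A real aggregation effort would likely need a far larger vocabulary and clinical review of the unmapped entries.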

While using standard terms might seem like one more hassle in a busy veterinarian’s day, Dr. Kreisler said standard terminology is critical for advancing clinical care and advocated for veterinarians to put pressure on practice software vendors to implement standardized diagnostic codes and terminology. She gave the example of how a common language could allow veterinarians to set key performance indicators for medical outcomes, much as they have been created for financial outcomes, demonstrating where practices may be able to improve patient care.

“It would also give veterinarians essential tools for client communication, allowing them to convey the value of particular diagnostics and procedures in ways that help clients to participate in medical decision making meaningfully,” Dr. Kreisler said.

Dr. Logan of Waltham said data storage is another substantial cost for big data projects. So are trained experts in data analysis. During the past five years, the institute has hired dozens of data scientists to work on long-term health data and is looking for scientists of various disciplines in academia and at companies who are interested in partnering with Waltham to analyze health data.

A cat that was tested for renal disease with a tool developed using large-scale patient data (Courtesy of Antech Diagnostics)

Turning data into tools

Dr. Jimmy Barr, chief medical officer for Mars-owned BluePearl Specialty and Emergency Pet Hospital, said the ability to compare data across millions of medical records lets researchers characterize conditions in exciting new ways.

BluePearl alone provided care for about 850,000 pets in 1.3 million visits during 2021, according to BluePearl’s 2021 Pet Health Trends Report.

Studies that take advantage of large amounts of data also can help clinicians see the most efficient ways to care for patients, as well as produce a safer environment for patients and doctors. BluePearl data have already been used to implement more structure during rounds to reduce mistakes during patient handoffs, and ongoing studies could show the influence of the COVID-19 pandemic on hospital workloads and efficiency.

But Dr. Barr said he is most interested in finding answers to questions about how to provide the best care in specific scenarios. That could take the form of decision trees or algorithms regarding which antimicrobial is most likely to be effective for a particular infection.

“We are so early in the journey and the art of using data to really help our patients,” Dr. Barr said. “And I think that we have such an opportunity—all of us do—to collaborate in order to be able to do this.”

Dr. Ruple of the Dog Aging Project also is chair of the veterinary advisory board for pet insurance company Fetch by The Dodo. She said the company has been using machine learning and artificial intelligence to analyze 16 years of data on health outcomes for more than 500,000 dogs. One study, for example, identified a drop in anxiety-related claims coinciding with a rise in overall behavior-related claims as people began staying home during 2020 in response to the COVID-19 pandemic, and those results provided a warning that signs of anxiety could rebound as people return to offices.

Dr. Kreisler of the Association for Veterinary Informatics hopes that recent innovations in veterinary medicine demonstrate to veterinarians the value of the medical data they generate. These innovations include predictive algorithms for diagnosis of Addison’s disease in dogs as well as for the progression of chronic kidney disease in cats, computer-generated interpretation of radiographs, automated pain scoring from photographs of cats, and blood tests that improve preclinical cancer detection in dogs. All were developed using large veterinary data sets. While technologies such as natural language processing, which automates the analysis of unstructured data, are likely to play a role in the future, their development is, ironically, held back by a lack of coded data from which to learn.

Future analysis also could help pet owners decide which expenses to prioritize, Dr. Kreisler said. Clients may make different decisions on whether to approve a $300 diagnostic test if they know it will benefit one in two, one in 500, or one in 10,000 animals.
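The arithmetic behind that comparison can be sketched in a few lines. The example assumes, for illustration only, that every animal tested incurs the full $300 and that the stated odds apply uniformly; the function and variable names are invented here.

```python
from fractions import Fraction

TEST_COST_USD = 300  # figure quoted in the article


def cost_per_animal_helped(test_cost, one_in_n):
    """Total spent across all tested animals for each one that benefits."""
    benefit_rate = Fraction(1, one_in_n)
    return int(test_cost / benefit_rate)


# The three odds mentioned in the article: one in two, one in 500, one in 10,000.
costs = {n: cost_per_animal_helped(TEST_COST_USD, n) for n in (2, 500, 10_000)}
for n, cost in costs.items():
    print(f"1 in {n:,}: ${cost:,} spent per animal that benefits")
```

Under those assumptions, the spending per animal helped works out to $600, $150,000, and $3,000,000, respectively, which is the kind of contrast a client could weigh when deciding whether to approve the test.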

And the wealth of data generated is increasing every day, Dr. Kreisler said. Automatic feeders can read microchips to determine how often pets eat or drink, for example.

If the Mars biobank project is as successful as hoped, Dr. Logan said, “The benefits to our business and to the health of pets in general will be so great that the logic of continuing this beyond the 10 years will be overwhelming.

“And so we see this not only as a 10-year one-off project, but we actually see this ultimately as the future of veterinary care.”

Correction: An earlier version of this article misstated the affiliation of Dr. Audrey Ruple.

National canine cancer registry to provide incidence, prevalence data

Three companies announced in late May that they are launching a national canine cancer registry to provide the veterinary community and dog owners with incidence and prevalence data to help guide diagnosis and treatment of cancer in dogs.

Co-sponsored by Jaguar Animal Health, the communications agency TogoRun, and Ivee, which offers veterinary software for pet health records, the initiative will initially access information about canine cancer from a multiyear Gallup survey of U.S. dog owners and a retrospective review of more than 35,000 anonymous patient records with more than 800 confirmed cancer diagnoses.

The Gallup survey, conducted in March 2022, found that the prevalence of U.S. dogs with cancer in 2021 was 3.4%, less than the approximately 5% prevalence in humans that year. The survey also found that the incidence of U.S. dogs newly diagnosed with cancer in 2021 was 2.8%, approximately five times the 0.57% incidence of newly diagnosed cancer in humans that year.
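A quick check of those comparisons, using only the percentages quoted above (the variable names are illustrative, and the human prevalence is taken as exactly 5% because the article gives it only approximately):

```python
# Percentages quoted from the Gallup survey paragraph.
DOG_INCIDENCE_PCT = 2.8
HUMAN_INCIDENCE_PCT = 0.57
DOG_PREVALENCE_PCT = 3.4
HUMAN_PREVALENCE_PCT = 5.0  # assumption: the article says "approximately 5%"

# Incidence counts new diagnoses in the year; prevalence counts all existing cases.
incidence_ratio = DOG_INCIDENCE_PCT / HUMAN_INCIDENCE_PCT
prevalence_ratio = DOG_PREVALENCE_PCT / HUMAN_PREVALENCE_PCT

print(f"Incidence, dogs vs. humans: {incidence_ratio:.1f}x")
print(f"Prevalence, dogs vs. humans: {prevalence_ratio:.2f}x")
```

The incidence ratio comes to about 4.9, consistent with "approximately five times," while the prevalence in dogs is roughly two-thirds of the human figure.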

“Protecting dogs from cancer begins with knowing its impact by breed, type, age, gender, and location,” said Dr. Terry Fossum, a member of the scientific advisory board for the registry, in the May announcement. She is co-founder of Dr. Fossum’s Pet Care and CEO of Epic Veterinary Specialists. “The U.S. has lagged behind other countries where there are multiple canine health registries and there have been several attempts by other groups to establish a U.S. registry without success.”

Among the scientific advisory board’s activities are driving adoption of a consistent diagnostic coding system for canine cancer and supporting the goals of the National Cancer Institute’s Comparative Oncology Program. The board is encouraging veterinary clinics to adopt coding practices that align with the recently published Veterinary International Classification of Diseases for Oncology Canine Tumors First Edition.

Data from the canine cancer registry are accessible to the public via an interactive dashboard on the registry website, and clinical practitioners and academics have open access to the data for research purposes. The registry will grow as veterinary clinics and pet owners upload medical records of dogs with cancer.

Dog owners and members of the veterinary community can visit TakeChargeRegistry.com for more information, including how to upload medical records.

A version of this article appears in the September 2022 print issue of JAVMA.