Predicting Cancer Symptom Trajectories Needs Longitudinal Memory

The key challenge is not just prediction. It is learning from sparse, irregular, real-world symptom data and turning it into timely, personalized care.

Most healthcare systems still treat symptoms as “encounter artifacts.” They get documented when a clinician sees you, not when the symptom happens. For cancer patients, that gap is not just inconvenient. It can be dangerous.

A 2024 paper in JCO Clinical Cancer Informatics (Chae et al.) takes this problem head-on: it asks whether we can predict a patient’s future symptom trajectory from longitudinal EHR symptom documentation, even when the data is sparse and the timing is irregular.

That framing matters for Aether, because it matches the core bet behind the health graph: if medicine is longitudinal, then the data structure must be longitudinal too.


The paper’s starting point is blunt: prediction is hard because EHR symptom data is messy

The authors begin with a simple clinical truth: if you can anticipate symptom severity and progression, you can intervene earlier and plan treatment better. But most systems cannot do that well today.

“Ability to predict symptom severity and progression across treatment trajectories would allow clinicians to provide timely intervention.”

Then they point to the real barrier: symptom documentation is not clean, frequent, or standardized the way many ML papers assume.

“Such predictions are difficult because of sparse and inconsistent assessment.”

In other words: this is exactly the kind of real-world longitudinal problem where “last observed value” is not enough, and where learning from a patient’s history over time is the core capability rather than a nice-to-have.

What they built: symptom trajectory prediction from routinely collected nursing documentation

The study is a retrospective longitudinal analysis of hospitalized cancer patients (n = 208). The authors trained models to predict future symptom trajectories from prior symptom history, and compared three approaches: an LSTM recurrent neural network, linear regression, and random forest.

“We performed a retrospective, longitudinal analysis… (n = 208)… trained… to predict future symptom trajectories.”
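
To make “predict future symptom trajectories from prior symptom history” concrete, here is a minimal sketch of how a symptom series can be turned into training examples. It is an illustration, not the authors’ pipeline: the function name make_training_pairs, the window length, and the 0 to 10 severity scores are all assumptions made for this example.

```python
def make_training_pairs(severities, window=3):
    """Turn one patient's ordered severity scores into (history, next value) pairs.

    Hypothetical helper: the window length and the scoring scale are
    assumptions for illustration, not details taken from the paper.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    for end in range(window, len(severities)):
        history = severities[end - window:end]  # the `window` most recent prior scores
        target = severities[end]                # the score the model should predict
        pairs.append((history, target))
    return pairs


# Made-up severity scores for one patient, ordered by assessment time.
pairs = make_training_pairs([2, 3, 6, 6, 3], window=3)
```

Any of the three model families in the study could, in principle, be fit on pairs shaped like this. What differs is how much of the ordering and timing each model can actually use.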

The crucial detail is where the signal comes from. This is not a specialized research dataset built for ML. It is routine clinical documentation.

“We can successfully predict patients’ symptom trajectories… using routinely collected nursing documentation.”

That is the point Aether cares about: if the system can learn from “imperfect but abundant” clinical reality, it becomes deployable in the world, not just publishable.

The real insight: longitudinal models handle irregular time better than encounter-based heuristics

One reason trajectory prediction is hard is that symptom assessments are often irregular. The paper calls this out in plain terms.

“Patients’ assessments are irregular and may show dynamic, nonlinear symptom severity changes.”

In clinical practice, the default heuristic is often “what was the symptom last time.” The paper explicitly compares against that baseline (the last observation) and shows that trained models can outperform it.

“LSTM… linear regression… and random forest models were better than… the last observation.”
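
The last-observation baseline is simple enough to write down in a few lines. The sketch below scores it on a made-up series with mean absolute error; the Assessment class, the hours-based timestamps, and the severity values are assumptions for illustration, and the paper’s own evaluation metric may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    hours_since_admission: float  # assumed time unit for this example
    severity: int                 # assumed 0-10 scale


def last_observation_forecasts(history):
    """Predict each assessment as a copy of the one documented just before it."""
    return [previous.severity for previous in history[:-1]]


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up, irregularly spaced assessments for a single patient.
history = [
    Assessment(0.0, 2),
    Assessment(5.0, 3),
    Assessment(30.0, 6),
    Assessment(34.0, 6),
    Assessment(80.0, 3),
]

predicted = last_observation_forecasts(history)
actual = [a.severity for a in history[1:]]
baseline_mae = mean_absolute_error(predicted, actual)
```

Note that the baseline never looks at hours_since_admission. A 4-hour gap and a 46-hour gap are treated identically, which is the information a trained model has a chance to use.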

In the discussion, they explain why: simple encounter heuristics lose information, while longitudinal sequence models can incorporate irregular intervals.

“We assume the reason… is information loss… the LSTM-based model considers irregular time intervals.”
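
One common way to let a sequence model see irregular intervals is to feed the elapsed time since the previous assessment alongside each value. The sketch below shows that encoding only. It is an assumption about how such inputs could be prepared, not a description of the authors’ architecture, and to_time_aware_steps is a hypothetical name.

```python
def to_time_aware_steps(times_hours, severities):
    """Pair each severity with the hours elapsed since the previous assessment.

    Hypothetical encoding for illustration: each step is (severity, gap_hours),
    and the first step gets a gap of 0.0 because nothing precedes it.
    """
    if len(times_hours) != len(severities):
        raise ValueError("times and severities must have the same length")
    steps = []
    previous_time = None
    for time, severity in zip(times_hours, severities):
        gap = 0.0 if previous_time is None else time - previous_time
        if gap < 0:
            raise ValueError("assessments must be sorted by time")
        steps.append((float(severity), float(gap)))
        previous_time = time
    return steps


# Made-up assessment times (hours) and severity scores.
steps = to_time_aware_steps([0.0, 5.0, 30.0, 34.0], [2, 3, 6, 6])
```

With the gap carried in every step, a jump from 3 to 6 over 25 hours looks different to the model than the same jump over 4 hours. A last-observation rule cannot make that distinction.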

This is a clean articulation of why longitudinal representation is not optional. It is a prerequisite for prediction that works in real care.

What this implies for Aether: the health graph is the substrate for symptom intelligence

Aether’s health graph is designed to do something simple that most systems still fail at: preserve the patient’s evolving story over time.

This paper reinforces a key idea: prediction is only as good as the continuity of the data. If symptoms are captured as isolated checkboxes in isolated encounters, you will always be doing retrospective medicine.

An “EHR Lite” implementation is not about rebuilding a hospital EHR. It is about restoring longitudinal coherence where it is most often lost:

  • Outpatient department (OPD) symptom narratives
  • Patient-reported experiences between visits
  • Lab and treatment events that set the context for symptom changes
  • Caregiver notes and longitudinal observations that do not fit in a single encounter
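
As a rough sketch of what “longitudinal coherence” means as a data structure, the sources above can be merged into one time-ordered patient timeline. Everything here is hypothetical and for illustration only: TimelineEvent, build_timeline, symptom_series, the source and kind labels, and the sample data do not describe Aether’s actual schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TimelineEvent:
    when: datetime
    source: str  # e.g. "outpatient_note", "patient_report", "treatment_record"
    kind: str    # e.g. "symptom", "treatment", "lab_result"
    label: str
    value: Optional[float] = None  # severity or lab value, if there is one


def build_timeline(*streams):
    """Merge events from any number of sources into one time-ordered list."""
    merged = [event for stream in streams for event in stream]
    return sorted(merged, key=lambda event: event.when)


def symptom_series(timeline, label):
    """Extract one symptom's (time, severity) trajectory from the timeline."""
    return [(e.when, e.value) for e in timeline
            if e.kind == "symptom" and e.label == label]


# Made-up events from three different sources.
outpatient = [TimelineEvent(datetime(2024, 3, 4, 9, 0), "outpatient_note", "symptom", "nausea", 4.0)]
patient_reports = [
    TimelineEvent(datetime(2024, 3, 2, 20, 0), "patient_report", "symptom", "nausea", 2.0),
    TimelineEvent(datetime(2024, 3, 6, 8, 0), "patient_report", "symptom", "nausea", 6.0),
]
treatments = [TimelineEvent(datetime(2024, 3, 3, 10, 0), "treatment_record", "treatment", "chemo_cycle_start")]

timeline = build_timeline(outpatient, patient_reports, treatments)
nausea = symptom_series(timeline, "nausea")
```

In this sketch the outpatient note alone holds a single nausea score. Merged with the patient reports and the treatment event, it becomes a three-point trajectory with its treatment context attached.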

The paper’s approach also aligns with a pragmatic principle we use at Aether: start with what exists in the workflow, then learn from it. Here, that workflow artifact is nursing documentation and flowsheet symptom assessments.

The bigger takeaway: move from documentation systems to learning systems

Symptom burden is not static. It changes with treatment cycles, recovery, complications, stress, sleep, nutrition, and dozens of latent factors. If health systems only store snapshots, they cannot learn trajectories.

In a single retrospective cohort of 208 hospitalized patients, this paper points to a credible path: learn from sparse and irregular longitudinal data, and use that learning to individualize care.

For Aether, this strengthens the roadmap. The health graph is not just “organized records.” It is a substrate for trajectory intelligence: earlier alerts, better follow-up, and better coordination across time.

References

Chae, S., Street, W.N., Ramaraju, N., Gilbertson-White, S. Prediction of Cancer Symptom Trajectory Using Longitudinal Electronic Health Record Data and Long Short-Term Memory Neural Network. JCO Clinical Cancer Informatics. 2024;8:e2300039.

https://doi.org/10.1200/CCI.23.00039

https://pmc.ncbi.nlm.nih.gov/articles/PMC10948138/

https://pmc.ncbi.nlm.nih.gov/articles/PMC10948138/pdf/cci-8-e2300039.pdf