CVE-2026-31963 Overview
CVE-2026-31963 is a heap buffer overflow vulnerability in HTSlib, a widely used library for reading and writing bioinformatics file formats. The flaw exists in the CRAM (Compressed Reference-oriented Alignment Map) file format decoder, specifically in how the library handles features that appear beyond the extent of a CRAM record sequence. An off-by-one error in the boundary validation logic allows an attacker to write one controlled byte beyond the end of a heap buffer, potentially leading to arbitrary code execution.
Critical Impact
This heap buffer overflow vulnerability in HTSlib's CRAM decoder can be exploited through maliciously crafted files to crash applications, corrupt heap memory structures, or achieve arbitrary code execution on systems processing untrusted bioinformatics data.
Affected Products
- HTSlib versions prior to 1.21.1
- HTSlib versions 1.22.x prior to 1.22.2
- HTSlib version 1.23 (fixed in 1.23.1)
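The version ranges above can be turned into a small check for inventory scripts or startup guards. The following is a minimal sketch, not part of HTSlib: the function name `htslib_version_is_affected` is hypothetical, and it assumes version strings of the form `1.23` or `1.23.1`, with any trailing suffix ignored.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper (not an HTSlib API): returns true if a version string
 * such as "1.22.1" falls inside the affected ranges listed in this advisory. */
bool htslib_version_is_affected(const char *version)
{
    int major = 0, minor = 0, patch = 0;

    /* Releases look like "1.23" or "1.23.1"; a missing patch level stays 0. */
    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) < 2)
        return true;                /* unparseable: treat as affected to be safe */

    if (major != 1)
        return major < 1;           /* assumption: anything before 1.x is affected */

    switch (minor) {
    case 21: return patch < 1;      /* fixed in 1.21.1 */
    case 22: return patch < 2;      /* fixed in 1.22.2 */
    case 23: return patch < 1;      /* fixed in 1.23.1 */
    default: return minor < 21;     /* older branches affected; later ones assumed fixed */
    }
}
```

In a program linked against HTSlib, the string returned by `hts_version()` (declared in `htslib/hts.h`) could be passed to this helper; for command-line tools, the version printed by `htsfile --version` serves the same purpose.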
Discovery Timeline
- 2026-03-18 - CVE-2026-31963 published to NVD
- 2026-03-19 - Last updated in NVD database
Technical Details for CVE-2026-31963
Vulnerability Analysis
The vulnerability resides in HTSlib's CRAM format decoder, which handles compressed DNA sequence alignment data. CRAM employs reference-based compression to reduce file sizes by storing only the differences between alignment records and a reference sequence, rather than complete sequence data. These differences are encoded as "features" that indicate variations at specific positions.
The flaw stems from an off-by-one error in the boundary checking logic within the cram/cram_decode.c file. When decoding CRAM features, the code validates that feature positions fall within the bounds of the record sequence. However, the original implementation applied a single upper limit of one past the last base to every feature type. That limit is only legitimate for operations that do not store a base in the read (such as deletions, reference skips, padding, or hard clips); a feature that writes sequence data at that position lands one byte past the end of the buffer.
The vulnerable code path allowed a feature with an attacker-controlled position value to write one byte beyond the allocated heap buffer. While a single-byte overflow may seem limited, it can corrupt heap metadata or adjacent heap objects, enabling exploitation techniques such as heap grooming to achieve arbitrary code execution.
Root Cause
The root cause is an insufficient boundary validation check in the CRAM feature decoding logic. The original code used a simple comparison if (pos > cr->len+1) that failed to account for the different valid position ranges depending on the feature operation type. Operations like N (reference skip), P (padding), H (hard clip), and D (deletion) can legitimately occur at position cr->len+1, while other operations should be restricted to cr->len. This logic error allowed malicious CRAM files to specify feature positions that would trigger writes past the end of the sequence buffer.
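The difference between the two checks can be illustrated in isolation. This is a simplified sketch rather than the HTSlib code: the function names are hypothetical, `read_len` stands in for `cr->len`, and the pre-patch rule is reduced to the single comparison quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified pre-patch rule: one upper limit (read_len + 1) for every feature.
 * Written as pos - 1 <= read_len to avoid overflow in this sketch. */
bool old_check_accepts(int32_t pos, int32_t read_len)
{
    return pos > 0 && pos - 1 <= read_len;
}

/* Simplified patched rule, modeled on the valid_end logic in the fix. */
bool new_check_accepts(char op, int32_t pos, int32_t read_len)
{
    if (pos <= 0)
        return false;                      /* before start of read */

    /* As in the patch, the end check is skipped when the sequence is absent. */
    if (read_len != 0 && pos > read_len) {
        bool may_follow_last_base =
            (op == 'N' || op == 'P' || op == 'H' || op == 'D');
        int64_t valid_end = (int64_t)read_len + (may_follow_last_base ? 1 : 0);
        if (pos > valid_end)
            return false;                  /* after end of read */
    }
    return true;
}

/* Feature positions are 1-based, so position pos maps to buffer index pos - 1.
 * A write is in bounds only when that index is below read_len. */
bool write_in_bounds(int32_t pos, int32_t read_len)
{
    return pos >= 1 && pos - 1 < read_len;
}
```

For a 10-base read, a substitution feature (`'X'` in the CRAM feature codes) at position 11 passes `old_check_accepts` yet fails `write_in_bounds`, which is the one-byte overflow. `new_check_accepts` rejects it while still allowing a deletion (`'D'`) at position 11.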
Attack Vector
The attack requires a user to open a maliciously crafted CRAM file. This could occur through:
- Research Data Exchange: Bioinformatics researchers frequently share and download sequence alignment files from public repositories or collaborators
- Pipeline Processing: Automated bioinformatics pipelines that process externally-sourced CRAM files
- Web Services: Online bioinformatics tools that accept user-uploaded files for analysis
When a vulnerable version of HTSlib processes the malicious file, the heap buffer overflow is triggered during the feature decoding phase, potentially allowing the attacker to:
- Crash the application (denial of service)
- Corrupt adjacent heap objects or allocator metadata, leading to unpredictable behavior
- Achieve arbitrary code execution by carefully crafting heap layout
     if (r) return r;
     pos += prev_pos;
+    // Misplaced feature detection - before start is easy
     if (pos <= 0) {
         hts_log_error("Feature position %d before start of read", pos);
         return -1;
     }
-    if (pos > seq_pos) {
-        if (pos > cr->len+1)
+    // After end is more complicated as the sequence may be absent,
+    // and operations like deletions could occur after the end
+    // of the stored sequence. First quickly find out if the feature is
+    // on or after the last base.
+    if (cr->len != 0 && pos > cr->len) {
+        // Now check carefully to ensure it's allowed.
+        int32_t valid_end = (op == 'N' || op == 'P' || op == 'H' || op == 'D')
+            ? cr->len+1
+            : cr->len;
+        if (pos > valid_end) {
+            hts_log_error("Feature position %d after end of read", pos);
             return -1;
+        }
+    }
+    if (pos > seq_pos) {
         if (s->ref && cr->ref_id >= 0) {
             if (ref_pos + pos - seq_pos > bfd->ref[cr->ref_id].len) {
                 static int whinged = 0;
Source: GitHub Commit Details
Detection Methods for CVE-2026-31963
Indicators of Compromise
- Unexpected crashes in applications using HTSlib when processing CRAM files
- Abnormal memory consumption patterns in bioinformatics pipeline processes
- Core dumps or segmentation faults from tools like samtools, bcftools, or custom applications linked against HTSlib
- Suspicious CRAM files with unusual feature position values in processing logs
Detection Strategies
- Monitor file processing applications for signs of heap corruption or unexpected termination
- Implement file integrity checks and source validation for CRAM files from external sources
- Use memory safety tools (AddressSanitizer, Valgrind) during development to detect heap overflows
- Review system logs for repeated crashes of bioinformatics applications processing external data
Monitoring Recommendations
- Enable application-level logging for HTSlib operations to identify malformed file processing attempts
- Configure crash monitoring for systems running bioinformatics pipelines that process external data
- Audit incoming CRAM files from untrusted sources before processing in production environments
- Deploy endpoint detection solutions to identify post-exploitation activities if code execution is achieved
How to Mitigate CVE-2026-31963
Immediate Actions Required
- Update HTSlib to version 1.23.1, 1.22.2, or 1.21.1 depending on your version branch
- Audit systems to identify all applications and pipelines using HTSlib for CRAM file processing
- Restrict processing of CRAM files from untrusted sources until patches are applied
- Review access controls for systems handling external bioinformatics data
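One way to enforce the restriction on untrusted CRAM input is to identify CRAM files by content rather than by file extension, since a renamed file would otherwise slip past an extension-based filter. The sketch below relies on the fact that CRAM files begin with the four ASCII bytes "CRAM"; the function names are hypothetical. It only recognizes the format and makes no attempt to tell whether a file is malicious.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns true if the buffer starts with the CRAM magic number ("CRAM"). */
bool looks_like_cram(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 'C', 'R', 'A', 'M' };
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}

/* Hypothetical gate for an intake step: reads the first four bytes of a file
 * and reports whether it should be treated as CRAM. Unreadable files return
 * false, so callers should handle open failures separately if that matters. */
bool file_looks_like_cram(const char *path)
{
    unsigned char head[4];
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return false;

    size_t n = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return looks_like_cram(head, n);
}
```

A pipeline could call `file_looks_like_cram()` on each externally sourced file and hold matching files in quarantine until the patched library is deployed.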
Patch Information
The HTSlib maintainers have released fixed versions addressing this vulnerability:
- Version 1.23.1 for users on the 1.23 release
- Version 1.22.2 for users on the 1.22 release branch
- Version 1.21.1 for users on the 1.21 release branch
The fix improves the boundary checking logic to properly validate feature positions based on the operation type, ensuring that features cannot be placed beyond valid positions for the given sequence. For detailed information, refer to the GitHub Security Advisory and the patch commit 8bcc9907.
Workarounds
- There is no workaround for this vulnerability; upgrading to a patched version is required
- As a temporary risk reduction measure, avoid processing CRAM files from untrusted or unverified sources
- Consider using alternative formats (BAM/SAM) for untrusted data until patching is complete, though this only reduces exposure, not the underlying risk
# Update HTSlib to a patched version
# For systems using package managers (package names vary by distribution and
# release, for example libhts3 and libhts-dev on Debian/Ubuntu; confirm that
# the packaged version actually includes the fix):
apt-get update && apt-get install --only-upgrade libhts3 libhts-dev
# For manual builds, download and compile the patched version:
wget https://github.com/samtools/htslib/releases/download/1.23.1/htslib-1.23.1.tar.bz2
tar -xjf htslib-1.23.1.tar.bz2
cd htslib-1.23.1
./configure && make && make install
# Verify the installed version
htsfile --version
Disclaimer: This content was generated using AI. While we strive for accuracy, please verify critical information with official sources.

