Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract region with uint16 (bioformats) #1742

Open
alevangel opened this issue Dec 10, 2024 · 2 comments
Open

Extract region with uint16 (bioformats) #1742

alevangel opened this issue Dec 10, 2024 · 2 comments

Comments

@alevangel
Copy link

alevangel commented Dec 10, 2024

I'm encountering an issue when attempting to extract an image with a 16-bit color space using Bio-Formats source. I'm unsure if this is expected behavior.

Specifically, if I convert the image to 8-bit, the resulting values are between 0 and 15, which seems unusually small.

Questions

  1. Is there a intensity scaling issue? I noticed the following in the metadata: '0028,1040 Pixel Intensity Relationship': 'LOG', and I'm not sure if this might be a factor.
  2. I'm requesting an output image with JPEG encoding, so I would generally expect it to be 8-bit. Is there something I'm missing?

To reproduce

Env

python 3.10
large_image 1.30.4
large_image_source_bioformats (commit a5b46d1)
python-bioformats 4.1.0

Steps

Here the input dicom file: 1-1 2.zip

Use the following Python code to attempt reading the region (e.g. the whole slide in this example):

import large_image
import matplotlib.pyplot as plt
import numpy as np

level = 3
size = (4096, 3328)  # (w,h)
location = (0, 0)  # (y,x)

slide = large_image.open(filename)
mag = slide.getMagnificationForLevel(level)
tile_image, _ = self.slide.getRegion(
            region=dict(left=location[1],
                        top=location[0],
                        right=location[1]+size[1],
                        bottom=location[0]+size[0],
                        units='pixels'),
            tile_size=dict(width=size[1], height=size[0]),
            tile_offset=dict(x=0, y=0),
            scale=dict(magnification=mag['magnification']),
            format=large_image.constants.TILE_FORMAT_NUMPY,
            encoding="JPEG",
        )

# Convert to 8 bit
if tile_image.dtype == np.uint32:
    tile_image = np.floor_divide(tile_image, 2 ** (32-8)).astype(np.uint8)
elif tile_image.dtype == np.uint16:
    tile_image = np.floor_divide(tile_image, 2 ** (16-8)).astype(np.uint8)

# Show
plt.imshow(nparray)
plt.show()
@alevangel
Copy link
Author

Hi @manthey,

I've been working on improving the handling of DICOM images where Bits Stored is less than Bits Allocated, and I've developed a solution that addresses the issues I was encountering, particularly with images appearing too dark due to incorrect scaling.

While I understand that a definitive solution should ideally be implemented within the Bioformats module itself, I've implemented a workaround by adding a new method, _scalePixelValues, to the base TileSource class

class TileSource(IPyLeafletMixin):

Here's the code for the _scalePixelValues method:

def _scalePixelValues(self, tile: np.ndarray) -> np.ndarray:
        """
        Scales pixel values of a tile based on DICOM metadata.

        This method applies necessary transformations to pixel values, including:
        1. Rescale Slope and Intercept.
        2. Scaling to utilize the full range of the dtype when Bits Stored < Bits Allocated.
        3. Handling of bit shifting when High Bit is not aligned with Bits Stored.

        Args:
            tile: The input tile as a NumPy array.

        Returns:
            The scaled tile as a NumPy array.
        """

        # Scale based on Rescale Intercept and Slope
        if all(key in self._metadata['seriesMetadata'] for key in
               ('0028,1053 Rescale Slope', '0028,1052 Rescale Intercept')):
            rescale_slope = float(self._metadata['seriesMetadata']['0028,1053 Rescale Slope'])
            rescale_intercept = float(self._metadata['seriesMetadata']['0028,1052 Rescale Intercept'])
            tile = tile * rescale_slope + rescale_intercept
            tile = tile.astype(self._dtype)

        # Scaling pixel values to use the full range of the dtype
        bits_stored_keys = [
            'Bits Stored',
            '0028,0101 Bits Stored',
        ]
        high_bits_keys = [
            'High Bit',
            '0028,0102 High Bit',
        ]

        if self._metadata.get('bitsPerPixel'):
            bits_allocated = self._metadata['bitsPerPixel']
        elif self._metadata['seriesMetadata'].get('0028,0100 Bits Allocated'):
            bits_allocated = int(self._metadata['seriesMetadata']['0028,0100 Bits Allocated'])
        else:
            # Infer the bits per pixel from the dtype
            bits_allocated = tile.dtype.itemsize * 8

        for key in bits_stored_keys:
            if key in self._metadata['seriesMetadata']:
                bits_stored = int(self._metadata['seriesMetadata'][key])

                # Handling bit shift
                for high_bit_key in high_bits_keys:
                    if high_bit_key in self._metadata['seriesMetadata']:
                        high_bit = int(self._metadata['seriesMetadata'][high_bit_key])
                        if high_bit < bits_stored - 1:
                            shift = (bits_stored - 1) - high_bit
                            tile = tile << shift
                            mask = (1 << bits_stored) - 1
                            tile = tile & mask

                if bits_stored < bits_allocated:
                    scale_factor = 2 ** (bits_allocated - bits_stored)
                    tile = (tile.astype(np.int32) * scale_factor).astype(self._dtype)

                    # Clamping
                    if self._dtype == np.uint16:
                        tile = np.clip(tile, 0, 65535)
                    elif self._dtype == np.int16:
                        tile = np.clip(tile, -32768, 32767)

        return tile

I then call this method within the getRegion method of the TileSource class, right after obtaining the image region:

if hasattr(self, '_metadata'):
            image = self._scalePixelValues(cast(np.ndarray, image))

I initially attempted to implement this logic within the getTile function of the Bioformats module. However, I encountered issues due to the recursive nature of getTile when fetching regions that span multiple tiles. Applying the scaling within getTile resulted in multiple scaling operations, leading to incorrect pixel values.

By placing the scaling logic in a separate method and calling it from getRegion, the scaling is applied only once per region request, ensuring the correct pixel values are returned.

I believe this approach provides a reasonable workaround for handling DICOM images with these specific characteristics, although a more permanent solution within the Bioformats library would be ideal.

I'm sharing this code in the hope that it might be helpful to others facing similar issues and to solicit feedback or suggestions for improvement.

Thanks for your work on this project!

@dgutman
Copy link
Collaborator

dgutman commented Dec 18, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants