How to use the library to capture HDF files#

The Commandline Capture of HDF Files Tutorial showed how to capture HDF files from the command line. The write_hdf_files function that the commandline client calls can also be integrated into custom Python applications; this guide shows how.

Approach 1: Call the function directly#

If you need a one-shot configure-and-run application, you can call the function directly:

import asyncio
import sys

from pandablocks.asyncio import AsyncioClient
from pandablocks.commands import Put
from pandablocks.hdf import write_hdf_files


async def arm_and_hdf():
    # Create a client and connect the control and data ports
    async with AsyncioClient(sys.argv[1]) as client:
        # Put to 2 fields simultaneously
        await asyncio.gather(
            client.send(Put("SEQ1.REPEATS", 1000)),
            client.send(Put("SEQ1.PRESCALE", 1000)),
        )
        # Listen for data, arming the PandA at the beginning
        await write_hdf_files(client, scheme="/tmp/panda-capture-%d.h5", arm=True)


if __name__ == "__main__":
    # One-shot run of a co-routine
    asyncio.run(arm_and_hdf())

Using the AsyncioClient as a context manager, this code sets up some fields of a PandA before taking a single acquisition. As arm=True is passed, write_hdf_files is responsible for arming the PandA.
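
write_hdf_files also accepts a num argument. As a minimal sketch, and assuming num behaves as in the version of the library this guide was written against (the number of acquisitions to write, with the %d in scheme substituted by the acquisition number), the last line above could become:

        # Assumption: num=3 captures three successive acquisitions,
        # writing each to its own file via the %d in scheme
        await write_hdf_files(
            client, scheme="/tmp/panda-capture-%d.h5", num=3, arm=True
        )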

Note

Unlike the Commandline Capture of HDF Files Tutorial, this example emits no log messages, because it does not configure the logging framework. You can get these messages by adding a call to logging.basicConfig like this:

logging.basicConfig(level=logging.INFO)
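
For instance, in the script above the call could sit at the top of the __main__ block (a sketch; anywhere before the coroutine runs will do):

import logging

if __name__ == "__main__":
    # Show INFO messages from the pandablocks library
    logging.basicConfig(level=logging.INFO)
    # One-shot run of a co-routine, as before
    asyncio.run(arm_and_hdf())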

Approach 2: Create the pipeline yourself#

If you need more control over the pipeline, for instance to display progress, you can create the pipeline yourself and feed it data from the PandA. This lets you make decisions about when to start and stop acquisitions based on the Data objects that flow past. For example, to make a progress bar we could:

import asyncio
import sys
import time

from pandablocks.asyncio import AsyncioClient
from pandablocks.commands import Arm, Put
from pandablocks.hdf import FrameProcessor, HDFWriter, create_pipeline, stop_pipeline
from pandablocks.responses import EndData, EndReason, FrameData, ReadyData


def print_progress_bar(fraction: float):
    # Print a simple progress bar, with a carriage return rather than newline
    # so that subsequent progress bars can be drawn on top
    print(f"{fraction * 100:5.1f}% [{'=' * int(fraction * 40):40s}]", end="\r")


async def hdf_queue_reporting():
    # Create the pipeline to scale and write HDF files
    pipeline = create_pipeline(
        FrameProcessor(), HDFWriter(scheme="/tmp/panda-capture-%d.h5")
    )
    try:
        async with AsyncioClient(sys.argv[1]) as client:
            # Gather data at 45MByte/s, should take about 60s
            repeats = 40000000
            await asyncio.gather(
                client.send(Put("SEQ1.REPEATS", repeats)),
                client.send(Put("SEQ1.PRESCALE", 0.5)),
            )
            progress = 0
            async for data in client.data(scaled=False, flush_period=1):
                # Always pass the data down the pipeline
                pipeline[0].queue.put_nowait(data)
                if isinstance(data, ReadyData):
                    # Data connection is ready, arm PandA
                    await client.send(Arm())
                elif isinstance(data, FrameData):
                    # Got some frame data, print a progress bar
                    progress += len(data.data)
                    print_progress_bar(progress / repeats)
                elif isinstance(data, EndData):
                    # We've done a single acquisition, check ok and return
                    assert data.reason == EndReason.OK, data.reason
                    break

    finally:
        start = time.time()
        print("\nClosing file...", end=" ")
        # Stop and wait for the pipeline to complete
        stop_pipeline(pipeline)
        print(f"took {time.time() - start:.1f} seconds")


if __name__ == "__main__":
    # One-shot run of a co-routine
    asyncio.run(hdf_queue_reporting())

This time, after setting up the PandA, we create the AsyncioClient.data iterator ourselves. Each Data object we receive is queued on the first Pipeline element, then inspected. Its type tells us whether to Arm the PandA, update the progress bar, or break out because the acquisition is complete.
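
Other Data subclasses can be handled the same way. For example, assuming StartData (imported from pandablocks.responses) carries the captured fields as FieldCapture objects with a name attribute, another branch inside the async for loop could report what is being captured:

                elif isinstance(data, StartData):
                    # Assumption: StartData.fields is a list of FieldCapture
                    # objects describing the captured fields
                    print("Capturing:", [field.name for field in data.fields])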

In a finally block we stop the pipeline, which waits for all queued data to flow through it and closes the HDF file.

Performance#

The commandline client and both of these approaches use the same core code, so they will give the same performance. The steps to consider when optimising performance are outlined in How fast can we write HDF files?