Data block API #136
I crossposted your request for feedback to the FastAI discord and received the following response:
The comment is now addressed above under "Apply an encoding to a specific block only".
It would be nice to have an API for easily constructing learning methods as manually implementing all the methods can get tedious and the resulting methods don't compose well.
The API would be similar to fastai's data block API with the main difference that it is limited to learning methods, i.e. it keeps data container creation and task-specific data encoding separate (handling only the encoding).
Based on
- `Block`s, which represent a kind of data, and
- `Encoding`s, transformations that encode data and are optionally invertible, allowing the decoding of outputs.
API
Best to give an example of what using it would look like. Below are reimplementations of some of FastAI.jl's computer vision methods.
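As a rough illustration of the shape such definitions could take, here is a minimal, self-contained sketch. The container name `BlockMethod`, the field names, and the default image size are assumptions made for this example; the block and encoding types are empty stand-ins that only borrow their names from the status list below, not FastAI.jl's actual implementations.

```julia
# Toy stand-ins: the real blocks and encodings carry behaviour, these only carry data.
abstract type AbstractBlock end
abstract type AbstractEncoding end

struct Image{N} <: AbstractBlock end
struct Mask{N} <: AbstractBlock
    classes::Vector{String}
end
struct Label <: AbstractBlock
    classes::Vector{String}
end

struct ProjectiveTransforms <: AbstractEncoding
    size::NTuple{2,Int}
end
struct ImagePreprocessing <: AbstractEncoding end
struct OneHot <: AbstractEncoding end

# Hypothetical container: a learning method is a tuple of blocks plus a tuple of encodings.
struct BlockMethod{B<:Tuple,E<:Tuple}
    blocks::B
    encodings::E
end

# Two vision methods that differ only in their target block.
ImageClassification(classes, sz=(224, 224)) = BlockMethod(
    (Image{2}(), Label(classes)),
    (ProjectiveTransforms(sz), ImagePreprocessing(), OneHot()),
)

ImageSegmentation(classes, sz=(224, 224)) = BlockMethod(
    (Image{2}(), Mask{2}(classes)),
    (ProjectiveTransforms(sz), ImagePreprocessing(), OneHot()),
)

method = ImageClassification(["cat", "dog"])
```

The point of the sketch is that both methods share the same encodings and differ only in the blocks they are applied to.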
Given just these short definitions, the block and encoding definitions, and the right interfaces in place, it would be possible to derive the following:
By grouping functionality by block or encoding, it would be much easier to compose and reuse different steps.
Status
Implemented:
- `Block`
- `Encoding`
- `StatefulEncoding`
- `Image`
- `Mask`
- `Label`
- `LabelMulti`
- `OneHotTensor`
- `ImageTensor`
- `ProjectiveTransforms` (now works with 3D images, masks and keypoints)
- `ImagePreprocessing` (now works with 3D images)
- `OneHot` (for labels, multi-class labels and masks)
To-do:
- `Encoding` into `AbstractEncoding`, `Encoding` and `WrapperEncoding`
- `Block` into `AbstractBlock`, `Block` and `WrapperBlock`
- `Only`
- `TaggedBlock`
The encodings depend on some minor changes to DataAugmentation.jl, to be released soon.
How do I
Apply an encoding to multiple blocks
By default, an encoding transforms every block for which an `encode` method is implemented. For example, `encode(ProjectiveTransforms(...), _, (Image(), Mask()), (img, mask))` will encode both image and mask (with the same random state for the augmentations), while `encode(ProjectiveTransforms(...), _, (Image(), Label(classes)), (img, class))` will encode only the image. The class is passed through unchanged, since no method `encode(::ProjectiveTransforms, _, ::Label, _)` is implemented.
Apply an encoding to a specific block only
Let's say you have blocks of different types and an encoding implemented for all blocks, but you only want to encode a single block. This could be achieved with a wrapper encoding that only applies the wrapped encoding if a condition is met. The example below shows how `ProjectiveTransforms`, which would encode both `Image` and `Mask`, is wrapped so that only `Image`s are transformed.
Now what if you had multiple blocks of the same type? We need a way to select which blocks to transform and which to leave be, but can no longer use the type to distinguish the blocks. Note that we cannot use indices into a tuple of blocks as selectors, since the same set of encodings needs to be callable on different sets of blocks (for example, during training inputs and targets are encoded together, while during inference inputs are encoded by themselves and model outputs are decoded by themselves).
One solution is to introduce a wrapper block (yes, Julia is big on composition) that associates a tag with the block which can then be referenced in the encoding wrapper.
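The behaviour described in this section and the previous one can be sketched in a few lines. This is a self-contained toy, not FastAI.jl's implementation: `Flip` is a made-up stand-in for `ProjectiveTransforms`, the fields of `Only` and `TaggedBlock` are assumptions, and the second argument `ctx` plays the role of the `_` in the calls above.

```julia
abstract type AbstractBlock end
abstract type AbstractEncoding end

struct Image <: AbstractBlock end
struct Mask <: AbstractBlock end

# Wrapper block that attaches a tag to any block (field names are assumptions).
struct TaggedBlock{B<:AbstractBlock} <: AbstractBlock
    tag::Symbol
    block::B
end

# Toy encoding standing in for ProjectiveTransforms: flips an array vertically.
struct Flip <: AbstractEncoding end

# Fallback: blocks without a specific `encode` method pass through unchanged.
encode(::AbstractEncoding, ctx, ::AbstractBlock, data) = data
encode(::Flip, ctx, ::Union{Image,Mask}, data) = reverse(data; dims=1)
# A tuple of blocks is encoded element-wise by default.
encode(enc::AbstractEncoding, ctx, blocks::Tuple, datas::Tuple) =
    map((b, d) -> encode(enc, ctx, b, d), blocks, datas)

# Wrapper encoding: applies `encoding` only to blocks for which `pred` is true.
struct Only{P,E<:AbstractEncoding} <: AbstractEncoding
    pred::P
    encoding::E
end
# Convenience constructor: select blocks by tag instead of by predicate.
Only(tag::Symbol, enc::AbstractEncoding) =
    Only(b -> b isa TaggedBlock && b.tag == tag, enc)

function encode(o::Only, ctx, block::AbstractBlock, data)
    o.pred(block) || return data                      # not selected: pass through
    inner = block isa TaggedBlock ? block.block : block
    return encode(o.encoding, ctx, inner, data)       # selected: delegate
end

img, mask = [1 2; 3 4], [0 1; 1 0]

# Default: both image and mask are transformed.
both = encode(Flip(), nothing, (Image(), Mask()), (img, mask))

# Select by type: only the image is transformed.
only_images = Only(b -> b isa Image, Flip())
by_type = encode(only_images, nothing, (Image(), Mask()), (img, mask))

# Select by tag: two blocks of the same type, only `:a` is transformed.
tagged = (TaggedBlock(:a, Image()), TaggedBlock(:b, Image()))
by_tag = encode(Only(:a, Flip()), nothing, tagged, (img, img))
```

Because selection happens per block rather than per tuple position, the same `Only` encoding keeps working when it is called on inputs alone or on inputs and targets together.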
Write an encoding that combines multiple blocks
By default, applying an encoding to a tuple of blocks will apply the encoding to each block individually. This can be overridden by implementing an `encode` method that dispatches on `Tuple` and can combine multiple blocks.
The example below shows a transform that concatenates selected blocks:
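A minimal, self-contained sketch of such a tuple-level method follows. The `Features` block and the `ConcatFeatures` encoding are hypothetical names invented for this example, and blocks are selected by type for simplicity.

```julia
abstract type AbstractBlock end
abstract type AbstractEncoding end

# Hypothetical blocks: a feature vector of known length, and a label.
struct Features <: AbstractBlock
    n::Int
end
struct Label <: AbstractBlock end

# Hypothetical encoding that merges all `Features` blocks into one vector.
struct ConcatFeatures <: AbstractEncoding end

# Dispatching on `Tuple` lets the encoding see all blocks at once
# instead of being applied to each block individually.
function encode(::ConcatFeatures, ctx, blocks::Tuple, datas::Tuple)
    selected = [d for (b, d) in zip(blocks, datas) if b isa Features]
    rest = Tuple(d for (b, d) in zip(blocks, datas) if !(b isa Features))
    # Concatenated features first, all other data passed through unchanged.
    return (reduce(vcat, selected), rest...)
end

blocks = (Features(2), Features(3), Label())
datas = ([1.0, 2.0], [3.0, 4.0, 5.0], "cat")
encoded = encode(ConcatFeatures(), nothing, blocks, datas)
```

A complete implementation would also need to report the blocks that result from the encoding (here, a single `Features(5)` plus the `Label`), which this sketch leaves out.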
Apply multiple encodings to the same block
This can be done with a wrapper encoding that stores multiple encodings.
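One possible reading, sketched below, is a wrapper that applies each stored encoding to the same input and returns one result per encoding. That fan-out behaviour is an assumption inferred from the segmentation example in the next section; the toy `OneHot` and `CreateForegroundWeights` implementations, the `fgweight` field, and the convention that class 1 is the background are all made up for illustration.

```julia
abstract type AbstractBlock end
abstract type AbstractEncoding end

# Toy mask block: data is a vector of class indices in 1:nclasses.
struct Mask <: AbstractBlock
    nclasses::Int
end

struct OneHot <: AbstractEncoding end
struct CreateForegroundWeights <: AbstractEncoding
    fgweight::Float64
end

# One-hot encode class indices into a (length, nclasses) Bool matrix.
encode(::OneHot, ctx, block::Mask, mask) =
    [m == c for m in mask, c in 1:block.nclasses]
# Weight foreground elements (any class other than 1) more heavily.
encode(e::CreateForegroundWeights, ctx, ::Mask, mask) =
    [m == 1 ? 1.0 : e.fgweight for m in mask]

# Wrapper that stores several encodings and applies each to the same block.
struct Encodings{T<:Tuple} <: AbstractEncoding
    encodings::T
end
Encodings(encs::AbstractEncoding...) = Encodings(encs)

# Returns a tuple with one encoded result per wrapped encoding.
encode(encs::Encodings, ctx, block::AbstractBlock, data) =
    map(enc -> encode(enc, ctx, block, data), encs.encodings)

mask = [1, 2, 2]
both = Encodings(OneHot(), CreateForegroundWeights(5.0))
onehot, weights = encode(both, nothing, Mask(2), mask)
```

Here a single `Mask` block yields a tuple of two encoded values, which is the situation the next section deals with.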
Create a learning method where model output block differs from the encoded target block
This has come up for me during segmentation, where I used a custom loss function to weigh foreground losses. So instead of the loss function being `loss(y_pred, y)`, it was `weightedloss(y_pred, (y, weights))`. Let's say we have an encoding `CreateForegroundWeights` that transforms a `Mask` block to create a `Weights` block. We can use the above `Encodings` to apply one-hot encoding and weight creation together (`Encodings(OneHot(), CreateForegroundWeights())`). If we transform the blocks `(Image{2}(), Mask{2}())`, the output would be `xblock, yblock = (ImageTensor(), (Mask(), Weights()))`. We can see that our `y`s would be compatible with the loss function, great. However, model outputs will be just `Mask` blocks, which leads to a problem when decoding, since by default the method expects the same target block (`yblock`). We can use the `outputblock` keyword argument to override this.
Use blocks from different applications together
The API is application-agnostic, so e.g. computer vision and tabular blocks can be used together. There is no special logic for vision methods. Unlike in fast.ai, there are no default encodings associated with blocks; every encoding is explicit.