Class BeamDataOperator

  • All Implemented Interfaces:
    DataOperator, java.lang.AutoCloseable

    public class BeamDataOperator
    extends java.lang.Object
    implements DataOperator
    A DataOperator for Apache Beam transformations.
    • Method Detail

      • close

        public void close()
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface DataOperator
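Because the class implements java.lang.AutoCloseable, an instance can be managed with a try-with-resources statement. The following is a minimal sketch of that pattern only. FakeOperator and CloseExample are hypothetical stand-ins introduced here so the example runs without Beam on the classpath; they are not part of the library, and how a real BeamDataOperator is obtained is not shown.

```java
// FakeOperator is a hypothetical stand-in for an AutoCloseable operator;
// it is NOT the real BeamDataOperator.
final class FakeOperator implements AutoCloseable {
  private boolean closed = false;

  boolean isClosed() {
    return closed;
  }

  @Override
  public void close() {
    // Declares no checked exception, matching the documented close() signature.
    closed = true;
  }
}

final class CloseExample {
  // Uses the operator inside try-with-resources and returns it afterwards,
  // so the caller can observe that close() was invoked on exit.
  static FakeOperator useAndClose() {
    FakeOperator operator = new FakeOperator();
    try (FakeOperator op = operator) {
      // Pipelines would be built and run here while the operator is open.
      if (op.isClosed()) {
        throw new IllegalStateException("operator closed too early");
      }
    }
    return operator;
  }

  public static void main(String[] args) {
    System.out.println("closed: " + useAndClose().isClosed());
  }
}
```

The resource is closed when the try block exits, whether normally or through an exception, so no explicit finally block is needed.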
      • getStream

        @SafeVarargs
        public final org.apache.beam.sdk.values.PCollection<StreamElement> getStream​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                     Position position,
                                                                                     boolean stopAtCurrent,
                                                                                     boolean useEventTime,
                                                                                     AttributeDescriptor<?>... attrs)
        Creates a PCollection in the given Pipeline from the commit log for the given attributes.
        Parameters:
        pipeline - the Pipeline to create the PCollection in
        position - the position in the commit log to start reading from
        stopAtCurrent - true to stop reading once the most recent data is reached
        useEventTime - true to use event time, false to use processing time
        attrs - the attributes to create the PCollection for
        Returns:
        the PCollection
      • getStream

        @SafeVarargs
        public final org.apache.beam.sdk.values.PCollection<StreamElement> getStream​(@Nullable
                                                                                     java.lang.String name,
                                                                                     org.apache.beam.sdk.Pipeline pipeline,
                                                                                     Position position,
                                                                                     boolean stopAtCurrent,
                                                                                     boolean useEventTime,
                                                                                     AttributeDescriptor<?>... attrs)
        Creates a PCollection in the given Pipeline from the commit log for the given attributes, reading under the given consumer name.
        Parameters:
        name - the name of the consumer, may be null
        pipeline - the Pipeline to create the PCollection in
        position - the position in the commit log to start reading from
        stopAtCurrent - true to stop reading once the most recent data is reached
        useEventTime - true to use event time, false to use processing time
        attrs - the attributes to create the PCollection for
        Returns:
        the PCollection
      • getBatchUpdates

        @SafeVarargs
        public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                           AttributeDescriptor<?>... attrs)
        Creates a PCollection from updates to the given attributes.
        Parameters:
        pipeline - Pipeline to create the PCollection in
        attrs - attributes to read updates for
        Returns:
        the PCollection
      • getBatchUpdates

        @SafeVarargs
        public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                           long startStamp,
                                                                                           long endStamp,
                                                                                           AttributeDescriptor<?>... attrs)
        Creates a PCollection from updates to the given attributes within the given time range.
        Parameters:
        pipeline - Pipeline to create the PCollection in
        startStamp - timestamp (inclusive) of first update taken into account
        endStamp - timestamp (exclusive) of last update taken into account
        attrs - attributes to read updates for
        Returns:
        the PCollection
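The bounds above describe a half-open interval: startStamp is included and endStamp is excluded. The following sketch illustrates only that boundary rule on plain timestamps. TimeRangeExample and its methods are hypothetical names used for illustration; they are not part of BeamDataOperator and say nothing about how the library filters internally.

```java
import java.util.ArrayList;
import java.util.List;

final class TimeRangeExample {
  // Mirrors the documented bounds: startStamp inclusive, endStamp exclusive.
  static boolean inRange(long stamp, long startStamp, long endStamp) {
    return stamp >= startStamp && stamp < endStamp;
  }

  // Keeps only the stamps that fall into [startStamp, endStamp).
  static List<Long> select(List<Long> stamps, long startStamp, long endStamp) {
    List<Long> selected = new ArrayList<>();
    for (long stamp : stamps) {
      if (inRange(stamp, startStamp, endStamp)) {
        selected.add(stamp);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<Long> stamps = List.of(5L, 10L, 15L, 20L, 25L);
    System.out.println(select(stamps, 10L, 20L));
    System.out.println(select(stamps, 20L, 30L));
  }
}
```

With half-open bounds, adjacent ranges such as [10, 20) and [20, 30) split the updates without overlap or gaps: an update stamped exactly 20 falls into the second range only.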
      • getBatchUpdates

        @SafeVarargs
        public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                           long startStamp,
                                                                                           long endStamp,
                                                                                           boolean asStream,
                                                                                           AttributeDescriptor<?>... attrs)
        Creates a PCollection from updates to the given attributes within the given time range.
        Parameters:
        pipeline - Pipeline to create the PCollection in
        startStamp - timestamp (inclusive) of first update taken into account
        endStamp - timestamp (exclusive) of last update taken into account
        asStream - true to create a PCollection that is suitable for streaming processing (i.e. one whose watermark can advance before the end of input)
        attrs - attributes to read updates for
        Returns:
        the PCollection
      • getBatchSnapshot

        public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                            AttributeDescriptor<?>... attrs)
        Creates a PCollection from a snapshot of the given attributes. The snapshot is either read from available storage or created by reducing the updates.
        Parameters:
        pipeline - Pipeline to create the PCollection in
        attrs - attributes to read snapshot for
        Returns:
        the PCollection
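To illustrate what "created by reduction of updates" can mean, the sketch below reduces a list of updates to the most recent one per key and attribute. It is an assumption-laden illustration, not the library's implementation: Update is a hypothetical, simplified stand-in for StreamElement, SnapshotExample is a made-up name, and deletions are not modeled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Update is a hypothetical, simplified stand-in for StreamElement.
final class Update {
  final String key;
  final String attribute;
  final long stamp;
  final String value;

  Update(String key, String attribute, long stamp, String value) {
    this.key = key;
    this.attribute = attribute;
    this.stamp = stamp;
    this.value = value;
  }
}

final class SnapshotExample {
  // Reduces the updates to the most recent one per (key, attribute) pair.
  // The map is indexed by "key#attribute"; input order does not matter.
  static Map<String, Update> snapshot(List<Update> updates) {
    Map<String, Update> latest = new LinkedHashMap<>();
    for (Update update : updates) {
      String id = update.key + "#" + update.attribute;
      // Keep whichever of the stored and the incoming update is newer.
      latest.merge(id, update, (old, cur) -> cur.stamp >= old.stamp ? cur : old);
    }
    return latest;
  }

  public static void main(String[] args) {
    Map<String, Update> snap = snapshot(List.of(
        new Update("user1", "name", 1L, "a"),
        new Update("user1", "name", 3L, "c"),
        new Update("user1", "name", 2L, "b")));
    System.out.println(snap.get("user1#name").value);
  }
}
```

The reduction keeps one element per key and attribute, which is why a snapshot is typically much smaller than the full history of updates it was derived from.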
      • getBatchSnapshot

        public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                            long fromStamp,
                                                                                            long untilStamp,
                                                                                            AttributeDescriptor<?>... attrs)
        Creates a PCollection from a snapshot of the given attributes. The snapshot is either read from available storage or created by reducing the updates.
        Parameters:
        pipeline - Pipeline to create the PCollection in
        fromStamp - updates older than this timestamp are ignored
        untilStamp - only updates older than this timestamp are read (i.e. as if this method had been called at the given timestamp)
        attrs - attributes to read snapshot for
        Returns:
        the PCollection
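The two bounds can be read as follows: updates with a timestamp below fromStamp are dropped, updates at or after untilStamp are not read, and the remainder is reduced to the latest value per key, which yields the state as it would have looked at untilStamp. The sketch below follows that reading of the parameter descriptions. BoundedSnapshotExample, StampedUpdate and snapshotAt are hypothetical names, and the code is not taken from the library.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BoundedSnapshotExample {
  // Hypothetical simplified update: entity key, timestamp and value.
  static final class StampedUpdate {
    final String key;
    final long stamp;
    final String value;

    StampedUpdate(String key, long stamp, String value) {
      this.key = key;
      this.stamp = stamp;
      this.value = value;
    }
  }

  // Returns the latest value per key among updates with
  // fromStamp <= stamp < untilStamp.
  static Map<String, String> snapshotAt(
      List<StampedUpdate> updates, long fromStamp, long untilStamp) {
    Map<String, StampedUpdate> latest = new HashMap<>();
    for (StampedUpdate update : updates) {
      boolean tooOld = update.stamp < fromStamp;
      boolean tooNew = update.stamp >= untilStamp;
      if (tooOld || tooNew) {
        continue;
      }
      latest.merge(update.key, update, (old, cur) -> cur.stamp >= old.stamp ? cur : old);
    }
    Map<String, String> values = new HashMap<>();
    latest.forEach((key, kept) -> values.put(key, kept.value));
    return values;
  }

  public static void main(String[] args) {
    List<StampedUpdate> updates = List.of(
        new StampedUpdate("a", 5L, "v5"),
        new StampedUpdate("a", 15L, "v15"),
        new StampedUpdate("a", 25L, "v25"),
        new StampedUpdate("b", 3L, "w3"));
    System.out.println(snapshotAt(updates, 10L, 20L));
  }
}
```

Note that a key whose only updates are older than fromStamp is absent from the result, so a high fromStamp can hide entities that have not changed recently.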
      • hasDirect

        public boolean hasDirect()