Class PubSubDataAccessor

    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      org.apache.beam.sdk.values.PCollection<StreamElement> createBatch​(org.apache.beam.sdk.Pipeline pipeline, java.util.List<AttributeDescriptor<?>> attrs, long startStamp, long endStamp)
      Create PCollection for given attribute family's batch updates storage.
      org.apache.beam.sdk.values.PCollection<StreamElement> createStream​(java.lang.String name, org.apache.beam.sdk.Pipeline pipeline, Position position, boolean stopAtCurrent, boolean eventTime, long limit)
      Create PCollection for given attribute family's commit log.
      org.apache.beam.sdk.values.PCollection<StreamElement> createStreamFromUpdates​(org.apache.beam.sdk.Pipeline pipeline, java.util.List<AttributeDescriptor<?>> attributes, long startStamp, long endStamp, long limit)
      Create PCollection for given attribute family's batchUpdates.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • createStream

        public org.apache.beam.sdk.values.PCollection<StreamElement> createStream​(java.lang.String name,
                                                                                  org.apache.beam.sdk.Pipeline pipeline,
                                                                                  Position position,
                                                                                  boolean stopAtCurrent,
                                                                                  boolean eventTime,
                                                                                  long limit)
        Description copied from interface: DataAccessor
        Create PCollection for given attribute family's commit log.
        Specified by:
        createStream in interface DataAccessor
        Parameters:
        name - name of the consumer
        pipeline - pipeline to create PCollection in
        position - to read from
        stopAtCurrent - stop reading at current data
        eventTime - true to use event time
        limit - limit number of elements read. Note that the number of elements might be actually lower, because it is divided by number of partitions It is useful mostly for testing purposes
        Returns:
        PCollection representing the commit log
      • createBatch

        public org.apache.beam.sdk.values.PCollection<StreamElement> createBatch​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                 java.util.List<AttributeDescriptor<?>> attrs,
                                                                                 long startStamp,
                                                                                 long endStamp)
        Description copied from interface: DataAccessor
        Create PCollection for given attribute family's batch updates storage.
        Specified by:
        createBatch in interface DataAccessor
        Parameters:
        pipeline - pipeline to create PCollection in
        attrs - attributes to read
        startStamp - minimal update timestamp (inclusive)
        endStamp - maximal update timestamp (exclusive)
        Returns:
        PCollection representing the batch updates
      • createStreamFromUpdates

        public org.apache.beam.sdk.values.PCollection<StreamElement> createStreamFromUpdates​(org.apache.beam.sdk.Pipeline pipeline,
                                                                                             java.util.List<AttributeDescriptor<?>> attributes,
                                                                                             long startStamp,
                                                                                             long endStamp,
                                                                                             long limit)
        Description copied from interface: DataAccessor
        Create PCollection for given attribute family's batchUpdates. The created PCollection is purposefully treated as unbounded (although it is bounded, in fact), which gives better performance in cases when it is united with another unbounded PCollection.
        Specified by:
        createStreamFromUpdates in interface DataAccessor
        Parameters:
        pipeline - pipeline to create PCollection in
        attributes - attributes to read updates for
        startStamp - minimal update timestamp (inclusive)
        endStamp - maximal update timestamp (exclusive)
        limit - limit number of elements read. Note that the number of elements might be actually lower, because it is divided by number of partitions It is useful mostly for testing purposes
        Returns:
        PCollection representing the commit log