Package cz.o2.proxima.beam.core
Class BeamDataOperator
- java.lang.Object
  - cz.o2.proxima.beam.core.BeamDataOperator
- All Implemented Interfaces: DataOperator, java.lang.AutoCloseable

public class BeamDataOperator extends java.lang.Object implements DataOperator

A DataOperator for Apache Beam transformations.

Method Summary
- void close()
- DataAccessor getAccessorFor(AttributeFamilyDescriptor family)
  Get DataAccessor for given AttributeFamilyDescriptor.
- org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot(org.apache.beam.sdk.Pipeline pipeline, long fromStamp, long untilStamp, AttributeDescriptor<?>... attrs)
  Create PCollection from snapshot of given attributes.
- org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot(org.apache.beam.sdk.Pipeline pipeline, AttributeDescriptor<?>... attrs)
  Create PCollection from snapshot of given attributes.
- org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, long startStamp, long endStamp, boolean asStream, AttributeDescriptor<?>... attrs)
  Create PCollection from updates to given attributes with given time range.
- org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, long startStamp, long endStamp, AttributeDescriptor<?>... attrs)
  Create PCollection from updates to given attributes with given time range.
- org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, AttributeDescriptor<?>... attrs)
  Create PCollection from updates to given attributes.
- DirectDataOperator getDirect()
- Repository getRepository()
  Retrieve repository associated with the operator.
- org.apache.beam.sdk.values.PCollection<StreamElement> getStream(java.lang.String name, org.apache.beam.sdk.Pipeline pipeline, Position position, boolean stopAtCurrent, boolean useEventTime, AttributeDescriptor<?>... attrs)
  Create PCollection in given Pipeline from commit log for given attributes.
- org.apache.beam.sdk.values.PCollection<StreamElement> getStream(org.apache.beam.sdk.Pipeline pipeline, Position position, boolean stopAtCurrent, boolean useEventTime, AttributeDescriptor<?>... attrs)
  Create PCollection in given Pipeline from commit log for given attributes.
- boolean hasDirect()
- void reload()
  Reload the operator after Repository has been changed.

Method Detail

close
public void close()
- Specified by: close in interface java.lang.AutoCloseable
- Specified by: close in interface DataOperator
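
Because the class implements java.lang.AutoCloseable, an instance can be managed by a try-with-resources statement, which calls close() when the block exits. The sketch below only illustrates that language mechanism; FakeOperator is a hypothetical stand-in and not the Proxima class, and whether a given application should close the operator itself is an assumption to verify against its own lifecycle.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseSketch {

  /** Hypothetical stand-in for an operator; NOT cz.o2.proxima.beam.core.BeamDataOperator. */
  public static final class FakeOperator implements AutoCloseable {
    private final List<String> log;

    FakeOperator(List<String> log) {
      this.log = log;
    }

    void use() {
      log.add("use");
    }

    @Override
    public void close() {
      // Invoked automatically at the end of the try-with-resources block.
      log.add("close");
    }
  }

  /** Runs the operator inside try-with-resources and returns the recorded call order. */
  public static List<String> run() {
    List<String> log = new ArrayList<>();
    try (FakeOperator op = new FakeOperator(log)) {
      op.use();
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [use, close]
  }
}
```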

reload
public void reload()
Description copied from interface: DataOperator
Reload the operator after Repository has been changed. This method is called automatically when ConfigRepository.reloadConfig(boolean, cz.o2.proxima.typesafe.config.Config) is called.
- Specified by: reload in interface DataOperator

getStream
@SafeVarargs public final org.apache.beam.sdk.values.PCollection<StreamElement> getStream(org.apache.beam.sdk.Pipeline pipeline, Position position, boolean stopAtCurrent, boolean useEventTime, AttributeDescriptor<?>... attrs)
Create PCollection in given Pipeline from commit log for given attributes.
- Parameters:
  pipeline - the Pipeline to create PCollection in
  position - position in commit log to read from
  stopAtCurrent - true to stop at recent data
  useEventTime - true to use event time
  attrs - the attributes to create PCollection for
- Returns: the PCollection

getStream
@SafeVarargs public final org.apache.beam.sdk.values.PCollection<StreamElement> getStream(@Nullable java.lang.String name, org.apache.beam.sdk.Pipeline pipeline, Position position, boolean stopAtCurrent, boolean useEventTime, AttributeDescriptor<?>... attrs)
Create PCollection in given Pipeline from commit log for given attributes.
- Parameters:
  name - name of the consumer
  pipeline - the Pipeline to create PCollection in
  position - position in commit log to read from
  stopAtCurrent - true to stop at recent data
  useEventTime - true to use event time
  attrs - the attributes to create PCollection for
- Returns: the PCollection

getBatchUpdates
@SafeVarargs public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, AttributeDescriptor<?>... attrs)
Create PCollection from updates to given attributes.
- Parameters:
  pipeline - Pipeline to create the PCollection in
  attrs - attributes to read updates for
- Returns: the PCollection

getBatchUpdates
@SafeVarargs public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, long startStamp, long endStamp, AttributeDescriptor<?>... attrs)
Create PCollection from updates to given attributes with given time range.
- Parameters:
  pipeline - Pipeline to create the PCollection in
  startStamp - timestamp (inclusive) of first update taken into account
  endStamp - timestamp (exclusive) of last update taken into account
  attrs - attributes to read updates for
- Returns: the PCollection
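
The time range is half-open: startStamp is inclusive and endStamp is exclusive. A minimal sketch of that boundary rule on plain Java collections follows. The Update class and the inRange helper are hypothetical names introduced only for illustration; they are not part of the Proxima or Beam APIs and say nothing about how the operator actually reads its storage.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TimeRangeSketch {

  /** Hypothetical update; a simplified stand-in for a StreamElement. */
  public static final class Update {
    public final String key;
    public final long stamp;

    public Update(String key, long stamp) {
      this.key = key;
      this.stamp = stamp;
    }
  }

  /** Keeps updates with startStamp <= stamp < endStamp (half-open interval). */
  public static List<Update> inRange(List<Update> updates, long startStamp, long endStamp) {
    return updates.stream()
        .filter(u -> u.stamp >= startStamp && u.stamp < endStamp)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Update> updates =
        List.of(new Update("a", 100L), new Update("b", 150L), new Update("c", 200L));
    // The update stamped 100 is kept (inclusive start); the one stamped 200 is dropped (exclusive end).
    for (Update u : inRange(updates, 100L, 200L)) {
      System.out.println(u.key + "@" + u.stamp);
    }
  }
}
```

With half-open ranges, consecutive calls such as [100, 200) and [200, 300) cover every update exactly once.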

getBatchUpdates
@SafeVarargs public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchUpdates(org.apache.beam.sdk.Pipeline pipeline, long startStamp, long endStamp, boolean asStream, AttributeDescriptor<?>... attrs)
Create PCollection from updates to given attributes with given time range.
- Parameters:
  pipeline - Pipeline to create the PCollection in
  startStamp - timestamp (inclusive) of first update taken into account
  endStamp - timestamp (exclusive) of last update taken into account
  asStream - create PCollection that is suitable for streaming processing (i.e. can update watermarks before end of input)
  attrs - attributes to read updates for
- Returns: the PCollection

getBatchSnapshot
public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot(org.apache.beam.sdk.Pipeline pipeline, AttributeDescriptor<?>... attrs)
Create PCollection from snapshot of given attributes. The snapshot is either read from available storage or created by reduction of updates.
- Parameters:
  pipeline - Pipeline to create the PCollection in
  attrs - attributes to read snapshot for
- Returns: the PCollection
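
One way to picture "created by reduction of updates" is to keep only the most recent update for each (key, attribute) pair. The sketch below does this on plain Java collections. The Update class, its fields, and the snapshot helper are hypothetical and introduced only for illustration; the actual reduction performed by the operator (for example, how deletions are handled) is not described in this page and is not modelled here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotSketch {

  /** Hypothetical update; a simplified stand-in for a StreamElement. */
  public static final class Update {
    public final String key;
    public final String attribute;
    public final long stamp;
    public final String value;

    public Update(String key, String attribute, long stamp, String value) {
      this.key = key;
      this.attribute = attribute;
      this.stamp = stamp;
      this.value = value;
    }
  }

  /** Reduces updates to the latest one per (key, attribute) pair, compared by stamp. */
  public static Map<String, Update> snapshot(List<Update> updates) {
    Map<String, Update> latest = new HashMap<>();
    for (Update u : updates) {
      String id = u.key + "#" + u.attribute;
      // merge() passes (existing, incoming); keep whichever has the higher stamp.
      latest.merge(id, u, (old, cur) -> cur.stamp >= old.stamp ? cur : old);
    }
    return latest;
  }

  public static void main(String[] args) {
    List<Update> updates = List.of(
        new Update("user1", "name", 1L, "A"),
        new Update("user1", "name", 5L, "B"),
        new Update("user1", "name", 3L, "C"),
        new Update("user2", "name", 2L, "D"));
    Map<String, Update> snap = snapshot(updates);
    System.out.println(snap.get("user1#name").value); // prints B
    System.out.println(snap.get("user2#name").value); // prints D
  }
}
```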

getBatchSnapshot
public final org.apache.beam.sdk.values.PCollection<StreamElement> getBatchSnapshot(org.apache.beam.sdk.Pipeline pipeline, long fromStamp, long untilStamp, AttributeDescriptor<?>... attrs)
Create PCollection from snapshot of given attributes. The snapshot is either read from available storage or created by reduction of updates.
- Parameters:
  pipeline - Pipeline to create the PCollection in
  fromStamp - ignore updates older than this timestamp
  untilStamp - read only updates older than this timestamp (i.e. as if this method had been called at the given timestamp)
  attrs - attributes to read snapshot for
- Returns: the PCollection

getAccessorFor
public DataAccessor getAccessorFor(AttributeFamilyDescriptor family)
- Parameters:
  family - descriptor of family to retrieve accessor for
- Returns: DataAccessor for given family

getRepository
public Repository getRepository()
Description copied from interface: DataOperator
Retrieve repository associated with the operator.
- Specified by: getRepository in interface DataOperator
- Returns: Repository associated with the operator

getDirect
public DirectDataOperator getDirect()

hasDirect
public boolean hasDirect()