Documentation of exe_kg_lib.classes.tasks package
Overview
This package contains classes that correspond to entities of type owl:Class that are either rdfs:subClassOf ds:Task or rdfs:subClassOf ds:AtomicTask in the KG. In either case, these entities are at the top level of the Task hierarchy for each of the three KG schemata: ML, Statistics and Visualization.
This package's classes implement the abstract run_method() to perform the following steps:
- The input data are taken:
  - Either from outputs of previous Tasks (parameter: other_task_output_dict) of the ExeKG
  - Or a given dataframe (parameter: input_data) that holds the input data for the ExeKG
- An algorithm is executed. The algorithm can be related to ML, Statistics or Visualization, depending on the Python file's prefix (i.e. ml, statistic, visual). The algorithm can:
  - Either be implemented as part of this library
  - Or belong to an external module. In this case, the module is determined using classes.tasks.task.Task.resolve_module() based on the Task's method_module_chain. See section Naming conventions for more info on method_module_chain.
- The output of the algorithm is returned as a dictionary with pairs of output name and value
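
The three steps above can be illustrated with a minimal sketch. It is not the library's implementation: SketchTask, MeanCalculation, get_inputs and the output name DataOutMean are hypothetical, a plain dict of columns stands in for the dataframe to avoid extra dependencies, and giving outputs of previous Tasks precedence over input_data is an assumption made only for this example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

# Stand-in for a dataframe (assumption): column name -> list of values.
Table = Dict[str, List[float]]


class SketchTask(ABC):
    """Hypothetical, simplified stand-in for classes.tasks.task.Task."""

    def __init__(self, input_names: List[str]) -> None:
        self.input_names = input_names

    @abstractmethod
    def run_method(
        self, other_task_output_dict: Dict[str, Any], input_data: Table
    ) -> Dict[str, Any]:
        """Execute the Task's algorithm and return its named outputs."""

    def get_inputs(
        self, other_task_output_dict: Dict[str, Any], input_data: Table
    ) -> Dict[str, Any]:
        # Step 1: take each input from a previous Task's output if present,
        # otherwise from the given input data (precedence is an assumption).
        resolved = {}
        for name in self.input_names:
            if name in other_task_output_dict:
                resolved[name] = other_task_output_dict[name]
            else:
                resolved[name] = input_data[name]
        return resolved


class MeanCalculation(SketchTask):
    """Hypothetical statistics Task whose algorithm is implemented locally."""

    def run_method(self, other_task_output_dict, input_data):
        inputs = self.get_inputs(other_task_output_dict, input_data)
        # Step 2: execute the algorithm.
        values = inputs[self.input_names[0]]
        mean = sum(values) / len(values)
        # Step 3: return a dictionary of output name and value.
        return {"DataOutMean": mean}


task = MeanCalculation(input_names=["feature_1"])
table = {"feature_1": [1.0, 2.0, 3.0]}

# Input comes from the given data because no previous Task produced it.
result_from_table = task.run_method({}, table)
# Input comes from a previous Task's output.
result_from_previous = task.run_method({"feature_1": [10.0, 20.0]}, table)
```

A dictionary is a convenient return type here because the outputs of one Task can be merged into the other_task_output_dict passed to the next one.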
Naming conventions
The following naming conventions are required to automatically map the KG's tasks (with their methods and properties) to Python objects while parsing the ExeKG.
- Each class name in this package is the name of an owl:Class that is either rdfs:subClassOf ds:Task or rdfs:subClassOf ds:AtomicTask.
- The method_params_dict and method_inherited_params_dict fields inherited from classes.tasks.task.Task contain parameters for the algorithm to be executed.
  - Their keys are produced by applying utils.string_utils.property_iri_to_field_name() to the datatype property names of the Task's linked ds:AtomicMethod instance in the ExeKG. E.g. a key named split_ratio corresponds to the hasParamSplitRatio property in the KG.
  - Their values are produced by applying classes.exe_kg_mixins.exe_kg_execution_mixin.ExeKGExecutionMixin._literal_to_field_value() to the literal values of the datatype properties in the ExeKG. E.g. a value of 0.2 corresponds to the "0.2"^^xsd:float literal value in the KG.
- The method_module_chain field inherited from classes.tasks.task.Task contains an ordered list of Python module names, from the top of the module hierarchy to the bottom.
  - The module hierarchy is determined by utils.query_utils.get_module_hierarchy_chain() starting from the Task's linked ds:AtomicMethod instance in the ExeKG, and proceeding via the rdfs:subClassOf+ ds:Module property path.
  - Each item in the list (except for the last one) comes from the name of an owl:Class that is rdfs:subClassOf+ ds:Module, after conversion by utils.string_utils.class_name_to_module_name().
  - The last item of the list comes from the type of the Task's linked ds:AtomicMethod instance, after conversion by utils.string_utils.class_name_to_method_name().

  The example below shows the module chain SVCMethod -> SvmModule -> SklearnModule, which leads to method_module_chain = ["sklearn", "svm", "SVC"].

      #############################
      ### START: ExeKG fragment ###
      #############################
      ml:BinaryClassification1 a ml:BinaryClassification ;
          ds:hasNextTask ml:Test1 ;
          ml:hasBinaryClassificationMethod ml:SVCMethod1 ;
          ml:hasTrainInput ml:DataInTrainX_BinaryClassification1_1,
                           ml:DataInTrainY_BinaryClassification1_1 ;
          ml:hasTrainOutput ml:DataOutTrainModelSVCMethod .
      ###########################
      ### END: ExeKG fragment ###
      ###########################

      #################################
      ### START: KG schema fragment ###
      #################################
      ml:SVCMethod a owl:Class ;
          rdfs:subClassOf ds:AtomicMethod,
                          ml:SvmModule,
                          ml:TrainMethod .

      ml:SvmModule a owl:Class ;
          rdfs:subClassOf ml:SklearnModule .

      ml:SklearnModule a owl:Class ;
          rdfs:subClassOf ds:Module .
      ###############################
      ### END: KG schema fragment ###
      ###############################
- The inputs and outputs fields inherited from classes.tasks.task.Task contain a list of classes.data_entity.DataEntity objects. In the case of inputs, the objects can also be of type classes.method.Method. The objects are generated by invoking the method _property_value_to_field_value() from the ExeKGExecutionMixin class. This method is applied to instances that are linked to the Task in the ExeKG through a subproperty of either ds:hasInput or ds:hasOutput.
  - The field names of DataEntity and Method objects are filled by applying utils.string_utils.property_iri_to_field_name() to the properties of the Task's linked ds:DataEntity or ds:Method instances.
  - In the case of input DataEntities, the object's fields are filled using the properties of the ds:DataEntity instances that are referenced by the Task's linked ds:DataEntity instances.

  The example below shows a LinePlotting1 task instance that has DataInToPlot_LinePlotting1_1 as input. DataInToPlot_LinePlotting1_1 references ds:feature_1. So, in this case, the inputs field of the corresponding Task Python object will contain a DataEntity object with fields: source = "feature_1", reference = IRI(ds:feature_1). The fields data_semantics and data_structure are mainly used during pipeline construction.

      ######################
      ### ExeKG fragment ###
      ######################
      visu:LinePlotting1 a visu:LinePlotting ;
          ds:hasNextTask visu:LinePlotting2 ;
          visu:hasLinePlottingMethod visu:PlotMethod1 ;
          visu:hasPlottingInput visu:DataInToPlot_LinePlotting1_1 .

      visu:DataInToPlot_LinePlotting1_1 a visu:DataInToPlot ;
          ds:hasReference ds:feature_1 .

      ds:feature_1 a ds:DataEntity, ds:Numerical, ds:Vector ;
          ds:hasSource "feature_1"^^xsd:string .
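
The key-naming and module-chain conventions described in this section can be sketched in a few lines of Python. The functions below are hypothetical stand-ins written for illustration only. They are not the bodies of property_iri_to_field_name(), class_name_to_module_name(), class_name_to_method_name() or resolve_module(), whose exact conversion rules are not shown here. In particular, stripping the hasParam prefix and the Module and Method suffixes is an assumption inferred from the examples above, and a standard-library chain is resolved instead of ["sklearn", "svm", "SVC"] so that the sketch runs without scikit-learn installed.

```python
import importlib
import re
from typing import Any, List


def property_name_to_field_name(property_name: str) -> str:
    """Hypothetical conversion: 'hasParamSplitRatio' -> 'split_ratio'."""
    stripped = re.sub(r"^hasParam", "", property_name)
    # Insert "_" before every capital letter except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", stripped).lower()


def strip_suffix(name: str, suffix: str) -> str:
    """Remove a trailing suffix if present."""
    return name[: -len(suffix)] if name.endswith(suffix) else name


def class_names_to_chain(class_names: List[str]) -> List[str]:
    """Build a module chain from class names ordered bottom-up.

    The first name is the method class; the rest are module classes,
    from the most specific to the most general (assumed ordering).
    """
    method_class, *module_classes = class_names
    modules = [strip_suffix(n, "Module").lower() for n in reversed(module_classes)]
    return modules + [strip_suffix(method_class, "Method")]


def resolve_chain(method_module_chain: List[str]) -> Any:
    """Import the module named by all items but the last, then fetch the last."""
    *module_names, attribute_name = method_module_chain
    module = importlib.import_module(".".join(module_names))
    return getattr(module, attribute_name)


field_name = property_name_to_field_name("hasParamSplitRatio")
chain = class_names_to_chain(["SVCMethod", "SvmModule", "SklearnModule"])
# Standard-library chain with the same package -> submodule -> class shape.
decoder_class = resolve_chain(["json", "decoder", "JSONDecoder"])
```

With scikit-learn installed, resolve_chain(chain) would be expected to return the sklearn.svm.SVC class in the same way.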