Introduce org.jpmml.evaluator.Transformer entry point #96

Open
christophe-rannou opened this issue Jan 25, 2018 · 8 comments

@christophe-rannou

Hi,

I have a question that may be related either to PMML itself or to jpmml-evaluator. I would like to serialize my preprocessing tasks through PMML. So far I have succeeded in expressing my preprocessing tasks using the TransformationDictionary and in outputting the desired processed field through Output when coupled with a MiningModel (such as a tree).
Is it possible to have something like an Identity MiningModel to return those fields without needing a dummy model to evaluate the PMML?

The following PMML sums up what I am trying to achieve (it is not valid, since the MiningModel is missing a functionName):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3" x-baseVersion="4.3">
    <Header>
        <Application name="MyApp"/>
    </Header>
    <DataDictionary>
        <DataField name="feat" optype="categorical" dataType="string">
            <Value value="A"/>
            <Value value="B"/>
            <Value value="C"/>
        </DataField>
    </DataDictionary>
    <TransformationDictionary>
        <DerivedField name="encoded_feat" optype="continuous" dataType="integer">
            <MapValues outputColumn="output">
                <FieldColumnPair field="feat" column="input"/>
                <InlineTable>
                    <row>
                        <input>A</input>
                        <output>0</output>
                    </row>
                    <row>
                        <input>B</input>
                        <output>1</output>
                    </row>
                    <row>
                        <input>C</input>
                        <output>2</output>
                    </row>
                </InlineTable>
            </MapValues>
        </DerivedField>
    </TransformationDictionary>
    <MiningModel>
        <MiningSchema>
            <MiningField name="feat"/>
        </MiningSchema>
        <Output>
            <OutputField name="final" optype="continuous" dataType="integer" feature="transformedValue">
                <FieldRef field="encoded_feat"/>
            </OutputField>
        </Output>
    </MiningModel>
</PMML>

Thanks

@vruusmann
Member

vruusmann commented Jan 27, 2018

I would like to serialize my preprocessing tasks through PMML.

The "entry point" of the JPMML-Evaluator library is the org.jpmml.evaluator.Evaluator interface, which requires a backing model element. If your PMML document does not contain any model elements, then you cannot use this "entry point" and either 1) must devise and develop an alternative "entry point" interface (something like org.jpmml.evaluator.Preprocessor?) or 2) use an alternative library.

Is it possible to have something like an Identity MiningModel to return those fields without needing a dummy model to evaluate the PMML?

You cannot use the MiningModel element for that, because it is a wrapper around child model elements.

However, you can use the RegressionModel element to represent regression-type identity transforms. Just construct the following regression table: y = 1.0 * encoded_feat + 0.0
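
To make the arithmetic of that workaround concrete, here is a minimal plain-Java sketch of what the derived field plus the identity regression table would compute. It does not use the JPMML-Evaluator API; the class and method names (`IdentityRegressionSketch`, `encode`, `identityRegression`) are hypothetical and exist only for illustration, and the lookup table is copied from the MapValues element in the question.

```java
import java.util.Map;

// Hypothetical illustration only; none of these names belong to JPMML-Evaluator.
final class IdentityRegressionSketch {

    // Mirrors the MapValues lookup from the PMML in the question: A -> 0, B -> 1, C -> 2.
    static final Map<String, Integer> ENCODING = Map.of("A", 0, "B", 1, "C", 2);

    // Derived field: encoded_feat = lookup(feat). Returns null for an unmapped category.
    static Integer encode(String feat) {
        return ENCODING.get(feat);
    }

    // Identity regression table: y = 1.0 * encoded_feat + 0.0
    static double identityRegression(int encodedFeat) {
        return 1.0 * encodedFeat + 0.0;
    }

    public static void main(String[] args) {
        for (String feat : new String[] {"A", "B", "C"}) {
            System.out.println(feat + " -> " + identityRegression(encode(feat)));
        }
    }
}
```

In this sketch the result comes back as a double (0.0, 1.0, 2.0) rather than the integer produced by the lookup, which is one practical difference between routing a value through a regression table and reading the derived field directly.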

@vruusmann changed the title from "Preprocessing only with jpmml-evaluator" to "Introduce org.jpmml.evaluator.Preprocessor entry point" on Jan 27, 2018
@vruusmann
Member

Reopening this issue, because I might want to do something about it in the upcoming 1.4.X development branch.

@vruusmann reopened this on Jan 27, 2018
@christophe-rannou
Author

I would like to work on this. Are there any pointers you could give me?

@vruusmann
Member

vruusmann commented Feb 1, 2018

You raised this issue, so you have a use case that needs addressing, not me.

The goal is to design an interface similar to org.jpmml.evaluator.Evaluator, but for preprocessors. The Evaluator interface encapsulates models, so it's dealing with model input, target and result fields; every field class has its own specification, etc.

The requirement here is to design a "preprocessor schema". It should have at least two schema query methods, Preprocessor#getArgumentFields() and Preprocessor#getResultFields(), and the evaluate method Preprocessor#evaluate(Map<FieldName, Object>). In any case, a preprocessor's argument fields and result fields are functionally different from standard model fields.
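
As a starting point for discussion, the three methods named above could be sketched roughly as follows. This is an assumption-laden illustration, not an existing JPMML-Evaluator interface: the names `PreprocessorSketch` and `MapValuesPreprocessor` are hypothetical, plain `String` keys stand in for `FieldName` so that the example compiles without the library, and the return types are guesses, since the thread specifies only the method names and the `evaluate` argument type.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface; String keys stand in for FieldName to keep the sketch self-contained.
interface PreprocessorSketch {

    // Fields the caller must supply.
    List<String> getArgumentFields();

    // Fields produced by the transformation.
    List<String> getResultFields();

    // Maps argument values to result values.
    Map<String, Object> evaluate(Map<String, Object> arguments);
}

// Hypothetical implementation that mimics a single MapValues derived field.
final class MapValuesPreprocessor implements PreprocessorSketch {

    private final String inputField;
    private final String outputField;
    private final Map<String, Integer> table;

    MapValuesPreprocessor(String inputField, String outputField, Map<String, Integer> table) {
        this.inputField = inputField;
        this.outputField = outputField;
        // Copy into a HashMap so that looking up a missing (null) input returns null
        // instead of throwing, as immutable Map.of(...) instances would.
        this.table = new HashMap<>(table);
    }

    @Override
    public List<String> getArgumentFields() {
        return List.of(inputField);
    }

    @Override
    public List<String> getResultFields() {
        return List.of(outputField);
    }

    @Override
    public Map<String, Object> evaluate(Map<String, Object> arguments) {
        Map<String, Object> results = new LinkedHashMap<>();
        // An unmapped or missing input yields a null (missing) result in this sketch.
        results.put(outputField, table.get(arguments.get(inputField)));
        return results;
    }
}

final class PreprocessorDemo {

    public static void main(String[] args) {
        PreprocessorSketch preprocessor =
            new MapValuesPreprocessor("feat", "encoded_feat", Map.of("A", 0, "B", 1, "C", 2));
        System.out.println(preprocessor.evaluate(Map.of("feat", "B"))); // prints {encoded_feat=1}
    }
}
```

A real design would also need to decide how argument and result fields describe their data types and value domains, which is the part that differs from the model fields handled by Evaluator.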

@vruusmann changed the title from "Introduce org.jpmml.evaluator.Preprocessor entry point" to "Introduce org.jpmml.evaluator.Transformer entry point" on Feb 4, 2018
@jqueguiner

@christophe-rannou : pushing code ? ;-)

@vruusmann
Member

Opened a "request for clarification" at DMG.org's issue tracker to have the entry/exit interfaces of transformer-only PMML documents specified:
http://mantis.dmg.org/view.php?id=228

@ZhejunWu

Hi,
I'd like to ask whether this issue has been resolved. I saw this related PR: #116 was closed instead of merged. I wonder if we have any workaround to use transformer-only pmml. Could you please advise?
Thanks!

@vruusmann
Member

I wonder if we have any workaround to use transformer-only pmml.

The PMML specification does not define such a workflow.

I've asked DMG.org to clarify the situation, but there's been no official response yet (it typically takes 2-3 years to obtain one):
http://mantis.dmg.org/view.php?id=228

The JPMML software project can always do a vendor extension. But I don't have a clear use case to base my work upon.

I saw this related PR: #116 was closed instead of merged.

We don't do copy&paste programming in this project.
