To Do List #18

Open
5 of 17 tasks
SixiangHu opened this issue Sep 11, 2015 · 1 comment

SixiangHu commented Sep 11, 2015

  • Boxplot for segmentation outliers
  • Fit tree methods to the residuals (to check for missing variables)
  • Single-tree visualisation of a two-dimensional split, so that different tree methods can be compared
  • Update Travis.yml from the current C template to the R template
  • EDA report
  • Quick XGBoost run to identify important variables
  • Quick anomaly detection run
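
For the boxplot item, a minimal sketch of what "outliers by segment" could mean, assuming the usual Tukey rule (points beyond 1.5 × IQR from the quartiles). It is written in Python for illustration only and is not the package's code; the function names `boxplot_outliers` and `outliers_by_segment` are hypothetical, and the quartiles from `statistics.quantiles` may differ slightly from the hinges that R's boxplot uses.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return the values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumes at least two observations (required by statistics.quantiles).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def outliers_by_segment(records, k=1.5):
    """Group (segment, value) pairs by segment and flag outliers per segment."""
    groups = {}
    for segment, value in records:
        groups.setdefault(segment, []).append(value)
    return {seg: boxplot_outliers(vals, k) for seg, vals in groups.items()}


records = [
    ("A", 1), ("A", 2), ("A", 3), ("A", 4), ("A", 100),
    ("B", 10), ("B", 11), ("B", 12), ("B", 13),
]
flagged = outliers_by_segment(records)
```

Flagging per segment rather than on the pooled data means a value is judged against its own group's spread, which is the comparison a segmented boxplot shows.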

Done:

Not Now:

  • Correlation matrix calculation for a numerical matrix or data frame, or a quicker way to calculate correlations on big data. [correlation calculation on numerical dataset #24]
  • Distributed / parallel Calculation
  • Need a model comparison function for all ML methods. A structure or template is needed to conduct this model comparison. A PMML structure would be a good starting point, even though not all ML packages support it. Alternatively, use the caret package directly.
  • Which terms in the model have been affected by new factors, and in what way?
    1. Coefficient
    2. Changes by level (vis)
    3. Is it because of correlation?
    4. What has this new factor explained?
      This is more like feature selection, and may be too specific to glm / lm methods.
  • Trends analysis:
    1. Within the model to assess the consistency
    2. Between datasets on different times to assess the development
    3. There should be a function that provides this feature. A starting point is the AbnormalDetection package and the changepoint package.
      This is more of a time-series analysis. Cannot find specific / achievable criteria.
  • Histogram of deviance residuals (symmetry). (REASON: resiPlot currently has a contour plot for this.)
  • Deviance residual (sqrt(weighted deviance) * sign(a-e)). (REASON: different residual function for resiPlot #52 provides an open slot for this.)
  • Deviance plot for outlier visualisation. (REASON: resiPlot currently has a contour plot for this.)
  • support multinomial analysis. (REASON: not urgent need.)
  • RcppParallel for the CramersV function. (REASON: no urgent need, and parallel overheads may make it slower rather than faster.)
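
For the deviance residual item, the formula sqrt(weighted deviance) * sign(a-e) can be sketched as follows. This is an illustrative Python sketch, not the resiPlot implementation: it assumes a Poisson family, reads `a` as the actual value and `e` as the expected (fitted) value, and uses a hypothetical function name `poisson_deviance_residual`.

```python
import math


def poisson_deviance_residual(actual, expected, weight=1.0):
    """Signed square root of the weighted Poisson unit deviance.

    Assumption: Poisson family, so the unit deviance is
    2 * w * (a * log(a / e) - (a - e)), with a * log(a / e) taken as 0 when a == 0.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    log_term = actual * math.log(actual / expected) if actual > 0 else 0.0
    unit_deviance = 2.0 * weight * (log_term - (actual - expected))
    unit_deviance = max(unit_deviance, 0.0)  # guard against tiny negative rounding error
    if actual == expected:
        return 0.0
    sign = 1.0 if actual > expected else -1.0
    return sign * math.sqrt(unit_deviance)


residuals = [
    poisson_deviance_residual(a, e)
    for a, e in [(0, 2.0), (3, 3.0), (4, 2.0)]
]
```

A histogram of these values is what the symmetry check in the list would look at; other families would need their own unit deviance in place of the Poisson one.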
@SixiangHu SixiangHu reopened this Sep 18, 2015
@SixiangHu SixiangHu self-assigned this Nov 22, 2021
@SixiangHu SixiangHu pinned this issue Nov 22, 2021